Abstract

The precise formulation of derivation for tree-adjoining grammars has impor- 
tant ramifications for a wide variety of uses of the formalism, from syntactic analysis 
to semantic interpretation and statistical language modeling. We argue that the 
definition of tree-adjoining derivation must be reformulated in order to manifest 
the proper linguistic dependencies in derivations. The particular proposal is both 
precisely characterizable through a definition of TAG derivations as equivalence 
classes of ordered derivation trees, and computationally operational, by virtue of 
a compilation to linear indexed grammars together with an efficient algorithm for 
recognition and parsing according to the compiled grammar. 



This paper is to appear in Computational Linguistics, volume 20, number 1, and is available 
from the Center for Research in Computing Technology, Division of Applied Sciences, Harvard 
University as Technical Report TR-08-92 and through the Computation and Language e-print 
archive as cmp-lg/9404001.
1 Introduction



In a context-free grammar, the derivation of a string in the rewriting sense can be 
captured in a single canonical tree structure that abstracts all possible derivation orders. 
As it turns out, this derivation tree also corresponds exactly to the hierarchical structure 
that the derivation imposes on the string, the derived tree structure of the string. The 
formalism of tree-adjoining grammars (TAG), on the other hand, decouples these two 
notions of derivation tree and derived tree. Intuitively, the derivation tree is a more 
finely grained structure than the derived tree, and as such can serve as a substrate 
on which to pursue further analysis of the string. This intuitive possibility is made 
manifest in several ways. Fine-grained syntactic analysis can be pursued by imposing 
on the derivation tree further combinatorial constraints, for instance, selective adjoining 
constraints or equational constraints over feature structures. Statistical analysis can be 
explored through the specification of derivational probabilities as formalized in stochastic 
tree-adjoining grammars. Semantic analysis can be overlaid through the synchronous 
derivations of two TAGs. 

All of these methods rely on the derivation tree as the source of the important 
primitive relationships among trees. The decoupling of derivation trees from derived 
trees thus makes possible a more flexible ability to pursue these types of analyses. At 
the same time, the exact definition of derivation becomes of paramount importance. 
In this paper, we argue that previous definitions of tree-adjoining derivation have not 
taken full advantage of this decoupling, and are not as appropriate as they might be 
for the kind of further analysis that tree-adjoining analyses could make possible. In 
particular, the standard definition of derivation, due to Vijay-Shanker (1987), requires 
that auxiliary trees be adjoined at distinct nodes in elementary trees. However, in certain 
cases, especially cases characterized as linguistic modification, it is more appropriate to 
allow multiple adjunctions at a single node. 

In this paper, we propose a redefinition of TAG derivation along these lines, whereby 
multiple auxiliary trees of modification can be adjoined at a single node, whereas only 
a single auxiliary tree of predication can. The redefinition constitutes a new definition 
of derivation for TAG that we will refer to as extended derivation. In order for such 
a redefinition to be serviceable, however, it is necessary that it be both precise and 
operational. In service of the former, we provide a formal definition of extended deriva- 
tion using a new approach to representing derivations as equivalence classes of ordered 
derivation trees. With respect to the latter, we provide a method of compilation of 
TAGs into corresponding linear indexed grammars (LIG), which makes the derivation 
structure explicit, and show how the generated LIG can drive a parsing algorithm that 
recovers, either implicitly or explicitly, the extended derivations of the string. 

The paper is organized as follows. First, we review Vijay-Shanker's standard defini- 
tion of TAG derivation, and introduce the motivation for extended derivations. Then, 
we present the extended notion of derivation and its formal definition. The original 
compilation of TAGs to LIGs provided by Vijay-Shanker and Weir and our variant for 
extended derivations are both described. Finally, we discuss a parsing algorithm for 
TAG that operates by a variant of Earley parsing on the corresponding LIG. The set
of extended derivations can subsequently be recovered from the set of Earley items gen- 
erated by the algorithm. The resultant algorithm is further modified so as to build an 
explicit derivation tree incrementally as parsing proceeds; this modification, which is a
novel result in its own right, allows the parsing algorithm to be used by systems that
require incremental processing with respect to tree-adjoining grammars.

Figure 1: A sample tree-adjoining grammar

2 The Standard Definition of Derivation 

To exemplify the distinction between standard and extended derivations, we exhibit the 
TAG of Figure 1. This grammar derives some simple noun phrases such as "roasted
red pepper" and "baked red potato". The former, for instance, is associated with the
derived tree in Figure 2(a). The tree can be viewed as being derived in two ways:²

Dependent: The auxiliary tree $\beta_{ro}$ is adjoined at the root node (address $\epsilon$)³ of $\beta_{re}$.
The resultant tree is adjoined at the N node (address 1) of initial tree $\alpha_{pe}$. This
derivation is depicted as the derivation tree in Figure 3(a).

Independent: The auxiliary trees $\beta_{ro}$ and $\beta_{re}$ are adjoined at the N node (address
1) of the initial tree $\alpha_{pe}$. This derivation is depicted as the derivation tree in
Figure 3(b).

In the independent derivation, two trees are separately adjoined at one and the same 
node in the initial tree. In the dependent derivation, on the other hand, one auxiliary 
tree is adjoined to the other, the latter only being adjoined to the initial tree. We will 
use this informal terminology uniformly in the sequel to distinguish the two general 
topologies of derivation trees. 

1 Here and elsewhere, we conventionally use the Greek letter $\alpha$ and its subscripted and primed variants
for initial trees, $\beta$ and its variants for auxiliary trees, and $\gamma$ and its variants for elementary trees in
general. The foot node of an auxiliary tree is marked with an asterisk ('*').

2 We ignore here the possibility of another dependent derivation wherein adjunction occurs at the 
foot node of an auxiliary tree. Because this introduces yet another systematic ambiguity, it is typically 
disallowed by stipulation in the literature on linguistic analyses using TAGs. 

3 The address of a node in a tree is taken to be its Gorn number, that sequence of integers specifying 
which branches to traverse in order starting from the root of the tree to reach the node. The address of 
the root of the tree is therefore the empty sequence, notated $\epsilon$. See the appendix for a more complete
discussion of notation. 
The standard definition of derivation, as codified by Vijay-Shanker, restricts derivations
so that two adjunctions cannot occur at the same node in the same elementary
tree. The dependent notion of derivation (Figure 3(a)) is therefore the only sanctioned
derivation for the desired tree in Figure 2(a); the independent derivation (Figure 3(b))
is disallowed. Vijay-Shanker's definition is appropriate because for any independent
derivation, there is a dependent derivation of the same derived tree. This can be easily
seen in that any adjunction of $\beta_2$ at a node at which an adjunction of $\beta_1$ occurs could
instead be replaced by an adjunction of $\beta_2$ at the root of $\beta_1$.

The advantage of this standard definition of derivation is that a derivation tree in 
this normal form unambiguously specifies a derived tree. The independent derivation 
tree on the other hand is ambiguous as to the derived tree it specifies in that a notion 
of precedence of the adjunctions at the same node is unspecified, but crucial to the 
derived tree specified. This follows from the fact that the independent derivation tree is 
symmetric with respect to the roles of the two auxiliary trees (by inspection), whereas 
the derived tree is not. By symmetry, therefore, it must be the case that the same 
independent derivation tree specifies the alternative derived tree in Figure 2(b).
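The asymmetry can be made concrete with a minimal Python sketch. The tree shapes are assumptions standing in for the sample grammar, whose figure is not reproduced here: each modifier is taken to be an auxiliary tree of the form [N [Adj word] N*], and the names `modifier` and `words`, as well as the tuple encoding, are hypothetical and introduced only for illustration.

```python
def modifier(adjective):
    """Return a function that adjoins an assumed modifier tree
    [N [Adj adjective] N*] at an N node, given the subtree excised there."""
    return lambda n_subtree: ("N", ("Adj", adjective), n_subtree)

def words(tree):
    """Left-to-right yield of a tree encoded as (label, child, child, ...)."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in words(child)]

pepper_n = ("N", "pepper")
red, roasted = modifier("red"), modifier("roasted")

# The independent derivation tree only records that both auxiliary trees
# adjoin at address 1. Each order of performing the two adjunctions yields
# a different derived tree.
red_first = ("NP", roasted(red(pepper_n)))
roasted_first = ("NP", red(roasted(pepper_n)))

print(" ".join(words(red_first)))      # roasted red pepper
print(" ".join(words(roasted_first)))  # red roasted pepper
```

Under these assumptions, the tree adjoined later ends up higher in the derived tree, so an unordered pair of sibling arcs at one address cannot by itself pick out one of the two results.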



3 Motivation for an Extended Definition of Derivation

In the absence of some further interpretation of the derivation tree, nothing hinges on the
choice of derivation definition, so that the standard definition disallowing independent
derivations is as reasonable as any other. However, tree-adjoining grammars are almost
universally extended with augmentations that make the issue apposite. We discuss
three such variations here, all of which argue for the use of independent derivations
under certain circumstances.⁴



3.1 Adding Adjoining Constraints 



Already in very early work on tree-adjoining grammars (Joshi, Levy, and Takahashi,
1975), constraints were allowed to be specified as to whether a particular auxiliary tree
may or may not be adjoined at a particular node in a particular tree. The idea is formulated
in its modern variant as selective-adjoining constraints (Vijay-Shanker and Joshi,
1985). As an application of this capability, we consider the traditional grammatical view
that directional adjuncts can be used only with certain verbs.⁵ This would account for
the felicity distinctions between the following sentences: 

4 The formulation of derivation for tree-adjoining grammars is also of significance for other grammatical
formalisms based on weaker forms of adjunction such as lexicalized context-free grammar (Schabes
and Waters, 1993a) and its stochastic extension (Schabes and Waters, 1993b), though we do not discuss
these arguments here.

5 For instance, Quirk et al. (1985, page 517) remark that "direction adjuncts of both goal and source 
can normally be used only with verbs of motion" . Although the restriction is undoubtedly a semantic 
one, we will examine the modeling of it in a TAG deriving syntactic trees for two reasons. First, 
the problematic nature of independent derivation is more easily seen in this way. Second, much of the 
intuition behind TAG analyses is based on a tight relationship between syntactic and semantic structure. 
Thus, whatever scheme for semantics is to be used with TAGs will require appropriate derivations to 
model these data. For example, an analysis of this phenomenon by adjoining constraints in the semantic
half of a synchronous TAG would be subject to the identical argument. See Section 3.3.
(1) a. Brockway walked his Labrador towards the yacht club.

b. # Brockway resembled his Labrador towards the yacht club.

This could be modeled by disallowing through selective adjoining constraints the 
adjunction of the elementary tree corresponding to a towards adverbial at the VP node 
of the elementary tree corresponding to the verb resembles.⁶ However, the restriction
applies even with intervening (and otherwise acceptable) adverbials. 

(2) a. Brockway walked his Labrador yesterday. 

b. Brockway walked his Labrador yesterday towards the yacht club. 



(3) a. Brockway resembled his Labrador yesterday. 

b. # Brockway resembled his Labrador yesterday towards the yacht club. 

Under the standard definition of derivation, there is no direct adjunction in the latter 
sentence of the towards tree into the resembles tree. Rather, it is dependently adjoined
at the root of the elementary tree that heads the adverbial yesterday, the latter directly 
adjoining into the main verb tree. To restrict both of the ill-formed sentences, then, 
a restriction must be placed not only on adjoining the goal adverbial in a resembles 
context, but also in the yesterday adverbial context. But this constraint is too strong,
as it disallows sentence (2b) above as well.

The problem is that the standard derivation does not correctly reflect the syntactic 
relation between the adverbial modifier and the phrase it modifies when there are multi- 
ple modifications in a single clause. In such a case, each of the adverbials independently 
modifies the verb, and this should be reflected in their independent adjunction at the 
same point. But this is specifically disallowed in a standard derivation. 
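The effect of the two derivation topologies on selective adjoining constraints can be illustrated with a small Python sketch. Everything named here is an assumption made for illustration: derivation trees are encoded as nested tuples, the VP node is arbitrarily given Gorn address 2, and `violations`, `dependent`, and `independent` are hypothetical helpers rather than part of any TAG system.

```python
# A derivation-tree node is (tree_name, [(address, child_node), ...]).
VP = (2,)   # assumed Gorn address of the VP node in a verb tree
ROOT = ()   # address of the root node

def violations(node, forbidden):
    """List every arc (parent, address, child) that is ruled out by a
    selective adjoining constraint in `forbidden`."""
    name, arcs = node
    found = []
    for address, child in arcs:
        arc = (name, address, child[0])
        if arc in forbidden:
            found.append(arc)
        found.extend(violations(child, forbidden))
    return found

# Constraint motivated by (1b): no `towards` adverbial at the VP of `resembled`.
forbidden = {("resembled", VP, "towards")}

def dependent(verb):
    # Standard derivation: `towards` adjoins at the root of `yesterday`.
    return (verb, [(VP, ("yesterday", [(ROOT, ("towards", []))]))])

def independent(verb):
    # Both adverbials adjoin directly at the VP node of the verb tree.
    return (verb, [(VP, ("yesterday", [])), (VP, ("towards", []))])

print(violations(dependent("resembled"), forbidden))    # empty: (3b) slips through
print(violations(independent("resembled"), forbidden))  # one violation: (3b) blocked
print(violations(independent("walked"), forbidden))     # empty: (2b) admitted
```

In this sketch, blocking (3b) under the dependent topology would require adding a constraint on the `yesterday` tree itself, which would then also rule out the dependent derivation of (2b).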

Another example along the same lines follows from the requirement that tense as 
manifested in a verb group be consistent with temporal adjuncts. For instance, consider 
the following examples: 

(4) a. Brockway walked his Labrador yesterday.
b. # Brockway will walk his Labrador yesterday.

(5) a. # Brockway walked his Labrador tomorrow.
b. Brockway will walk his Labrador tomorrow.

Again, the relationship is independent of other intervening adjuncts. 

(6) a. Brockway walked his Labrador towards the yacht club yesterday.
b. # Brockway will walk his Labrador towards the yacht club yesterday. 

(7) a. # Brockway walked his Labrador towards the yacht club tomorrow.
b. Brockway will walk his Labrador towards the yacht club tomorrow.

6 Whether the adjunction occurs at the VP node or the S node is immaterial to the argument.



It is important to note that these arguments apply specifically to auxiliary trees that 
correspond to a modification relationship. Auxiliary trees are used in TAG typically 
for predication relations as well,⁷ as in the case of raising and sentential complement
constructions.⁸ Consider the following sentences. (The brackets mark the leaves of the
pertinent trees to be combined by adjunction in the assumed analysis.) 

(8) a. Brockway assumed that Harrison wanted to walk his Labrador. 

b. [Brockway assumed that] [Harrison wanted] [to walk his Labrador] 

(9) a. Brockway wanted to try to walk his Labrador. 

b. [Brockway wanted] [to try] [to walk his Labrador] 

(10) a. * Harrison wanted Brockway tried to walk his Labrador. 

b. * [Harrison wanted] [Brockway tried] [to walk his Labrador] 

(11) a. Harrison wanted to assume that Brockway walked his Labrador. 

b. [Harrison wanted] [to assume that] [Brockway walked his Labrador] 



Assume (following, for instance, the analysis of Kroch and Joshi (1985)) that the trees
associated with the various forms of the verbs try, want, and assume all take sentential 
complements, certain of which are tensed with overt subjects and others untensed with 
empty subjects. The auxiliary trees for these verbs specify by adjoining constraints 
which type of sentential complement they take: assume requires tensed complements, 
want and try untensed. Under this analysis the auxiliary trees must not be allowed to 
independently adjoin at the same node. For instance, if trees corresponding to "Harrison
wanted" and "Brockway tried" (which both require untensed complements) were
both adjoined at the root of the tree for "to walk his Labrador", the selective adjoining
constraints would be satisfied, yet the generated sentence (10a) is ungrammatical.
Conversely, under independent adjunction, the sentence (11a) would be deemed ungrammatical,
although it is in fact grammatical. Thus, the case of predicative trees is entirely
unlike that of modifier trees. Here, the standard notion of derivation is exactly what is 
needed as far as interpretation of adjoining constraints is concerned. 

An alternative would be to modify the way in which adjoining constraints are updated
upon adjunction. If after adjoining a modifier tree at a node, the adjoining constraints
of the original node, rather than those of the root and foot of the modifier tree, are
manifest in the corresponding nodes in the derived tree, the adjoining constraints would
propagate appropriately to handle the examples above. This alternative leads, however,
to a formalism for which derivation trees are no longer context-free, with concomitant
difficulties in designing parsing algorithms. Instead, the extended definition of derivation
effectively allows use of a Kleene-* in the "grammar" of derivation trees.

7 We use the term 'predication' in its logical sense, that is, for auxiliary trees that serve as logical
predicates over the trees into which they adjoin, in contrast to the term's linguistic sub-sense in which
the argument of the predicate is a linguistic subject.

8 The distinction between predicative and modifier trees has been proposed previously for purely
linguistic reasons by Kroch (1989), who refers to them as complement and athematic trees, respectively.
The arguments presented here can be seen as providing further evidence for differentiating the two kinds
of auxiliary trees. A precursor to this idea can perhaps be seen in the distinction between repeatable
and nonrepeatable adjunction in the formalism of string adjunct grammars, a precursor of TAGs (Joshi,
Kosaraju, and Yamada, 1972b, pages 253-254).

Adjoining constraints can also be implemented using feature structure equations
(Vijay-Shanker and Joshi, 1988). It is possible that judicious use of such techniques
might prevent the particular problems noted here. Such an encoding of a solution 
requires consideration of constraints that pass among many trees just to limit the cooc- 
currence of a pair of trees. However, it more closely follows the spirit of TAGs to state 
such intuitively local limitations locally. 

In summary, the interpretation of adjoining constraints in TAG is sensitive to the 
particular notion of derivation that is used. Therefore, it can be used as a litmus 
test for an appropriate definition of derivation. As such, it argues for a nonstandard, 
independent, notion of derivation for modifier auxiliary trees and a standard, dependent, 
notion for predicative trees. 

3.2 Adding Statistical Parameters 

In a similar vein, the statistical parameters of a stochastic lexicalized TAG (SLTAG)
(Resnik, 1992; Schabes, 1992) specify the probability of adjunction of a given auxiliary
tree at a specific node in another tree. This specification may again be interpreted with 
regard to differing derivations, obviously with differing impact on the resulting probabili- 
ties assigned to derivation trees. (In the extreme case, a constraint prohibiting adjoining 
corresponds to a zero probability in an SLTAG. The relation to the argument in the pre- 
vious section follows thereby.) Consider a case in which linguistic modification of noun 
phrases by adjectives is modeled by adjunction of a modifying tree. Under the standard 
definition of derivation, multiple modifications of a single NP would lead to dependent 
adjunctions in which a first modifier adjoins at the root of a second. As an example, 
we consider again the grammar given in Figure 1, which admits of derivations for the
strings "baked red potato" and "baked red pepper". Specifying adjunction probabilities
on standard derivations, the distinction between the overall probabilities for these two
strings depends solely on the adjunction probabilities of $\beta_{re}$ (the tree for red) into $\alpha_{po}$
and $\alpha_{pe}$ (those for potato and pepper, respectively), as the tree $\beta_b$ for the word baked is
adjoined at the root of $\beta_{re}$ in both standard derivations. In the extended
derivations, on the other hand, both modifying trees are adjoined independently into the 
noun trees. Thus, the overall probabilities are determined as well by the probabilities 
of adjunction of the trees for baked into the nominal trees. It seems intuitively plausi- 
ble that the most important relationships to characterize statistically are those between 
modifier and modified, rather than between two modifiers.⁹ In the case at hand, the
fact that one typically refers to the process of cooking potatoes as "baking", whereas the
appropriate term for the corresponding cooking process applied to peppers is "roasting",
would be more determining of the expected overall probabilities.

9 Intuition is an appropriate guide in the design of the SLTAG framework, as the idea is to set up
a linguistically plausible infrastructure on top of which a lexically-based statistical model can be built.
In addition, suggestive (though certainly not conclusive) evidence along these lines can be gleaned from
corpora analyses. For instance, in a simple experiment in which medium frequency triples of exactly the
discussed form "(adjective) (adjective) (noun)" were examined, the mean mutual information between
the first adjective and the noun was found to be larger than that between the two adjectives.

Note again that the distinction between modifier and predicative trees is important. 
The standard definition of derivation is entirely appropriate for adjunction probabilities 
for predicative trees, but not for modifier trees. 
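A small numerical sketch in Python may help fix the point about modifier trees. The parameter values below are invented purely for illustration, and the model is deliberately simplified: a derivation's probability is taken to be the product of the adjunction parameters on its arcs, ignoring initial-tree and null-adjunction probabilities. The names `p_adjoin`, `standard`, and `extended` are hypothetical.

```python
import math

# Hypothetical parameters: P(auxiliary tree adjoins at a node of a host tree).
p_adjoin = {
    ("red", "potato"): 0.10, ("red", "pepper"): 0.10,
    ("baked", "potato"): 0.20, ("baked", "pepper"): 0.01,
    ("baked", "red"): 0.05,  # baked adjoined at the root of the tree for red
}

def derivation_probability(arcs):
    """Product of the adjunction parameters on the arcs of a derivation tree."""
    return math.prod(p_adjoin[arc] for arc in arcs)

def standard(noun):
    # Dependent derivation: red adjoins into the noun tree, baked into red.
    return derivation_probability([("red", noun), ("baked", "red")])

def extended(noun):
    # Independent derivation: both modifiers adjoin into the noun tree.
    return derivation_probability([("red", noun), ("baked", noun)])

for noun in ("potato", "pepper"):
    print(noun, standard(noun), extended(noun))
```

With these made-up numbers, the standard derivations assign the two strings the same probability, since only the parameters for red distinguish them, whereas the extended derivations let the affinity between baked and potato show up in the result.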



3.3 Adding Semantics 

Finally, the formation of synchronous TAGs has been proposed to allow use of TAGs 
in semantic interpretation, natural language generation, and machine translation. In 
previous work (Shieber and Schabes, 1990), the definition of synchronous TAG derivation
is given in a manner that requires multiple adjunctions at a single node. The need
for such derivations follows from the fact that synchronous derivations are intended to
model semantic relationships. In cases of multiple adjunction of modifier trees at a single
node, the appropriate semantic relationships comprise separate modifications rather than
cascaded ones, and this is reflected in the definition of synchronous TAG derivation.¹⁰
Because of this, a parser for synchronous TAGs must recover, at least implicitly, the
extended derivations of TAG derived trees. Shieber (Forthcoming) provides a more
complete discussion of the relationship between synchronous TAGs and the extended 
definition of derivation with special emphasis on the ramifications for formal expressivity. 

Note that the independence of the adjunction of modifiers in the syntax does not 
imply that semantically there is no precedence or scoping relation between them. As 
exemplified in Figure 2, the derived tree generated by multiple independent adjunctions
at a single node still manifests nesting relationships among the adjoined trees. This fact
may be used to advantage in the semantic half of a synchronous tree-adjoining grammar
to specify the semantic distinction between, for example, the following two sentences:¹¹



(12) a. Brockway ran over his polo mallet twice intentionally. 

b. Brockway ran over his polo mallet intentionally twice. 

We hope to address this issue in greater detail in future work on synchronous tree- 
adjoining grammars. 

3.4 Desired Properties of Extended Derivations 

We have presented several arguments that the standard notion of derivation does not 
allow for an appropriate specification of dependencies to be captured. An extended 
notion of derivation is needed that

1. Differentiates predicative and modifier auxiliary trees;

2. Requires dependent derivations for predicative trees;

3. Allows independent derivations for modifier trees; and

4. Unambiguously and nonredundantly specifies a derived tree.

Furthermore, following from considerations of the role of modifier trees in a grammar as
essentially optional and freely applicable elements, we would like the following criterion
to hold of extended derivations:

5. If a node can be modified at all, it can be modified any number of times, including
zero times.

9 (continued) The statistical assumptions behind this particular experiment do not allow very robust
conclusions to be drawn, and more work is needed along these lines.

10 The importance of the distinction between predicative and modifier trees with respect to how
derivations are defined was not appreciated in the earlier work; derivations were taken to be of the
independent variety in all cases. In future work, we plan to remedy this flaw.

11 We are indebted to an anonymous reviewer of an earlier version of this paper for raising this issue
crisply through examples similar to those given here.

Recall that a derivation tree (as traditionally conceived) is a tree with unordered 
arcs where each node is labeled by an elementary tree of a TAG and each arc is labeled 
by a tree address specifying a node in the parent tree. In a standard derivation tree 
no two sibling arcs can be labeled with the same address. In an extended derivation 
tree, however, the condition is relaxed: No two sibling arcs to predicative trees can be 
labeled with the same address. Thus, for any given address there can be at most one 
predicative tree and several modifier trees adjoined at that node. As we have seen, this 
relaxed definition violates the fourth desideratum above; for instance, the derivation tree 
in Figure ||(b) ambiguously specifies both derived trees in Figure g. In the next section, 
we provide a formal definition of extended derivations that satisfies all of the criteria 
above. 
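The two conditions on sibling arcs can be stated as simple predicates. The following Python sketch assumes a nested-tuple encoding of derivation trees and a set naming the predicative trees; both, along with the function names `is_standard` and `is_extended`, are hypothetical conveniences rather than anything prescribed by the formalism.

```python
from collections import Counter

# A derivation-tree node is (tree_name, [(address, child_node), ...]).

def is_standard(node):
    """No two sibling arcs carry the same address, anywhere in the tree."""
    _, arcs = node
    counts = Counter(address for address, _ in arcs)
    return all(n == 1 for n in counts.values()) and all(
        is_standard(child) for _, child in arcs)

def is_extended(node, predicative):
    """No two sibling arcs to predicative trees carry the same address.
    `predicative` is the set of names of predicative auxiliary trees."""
    _, arcs = node
    counts = Counter(
        address for address, child in arcs if child[0] in predicative)
    return all(n == 1 for n in counts.values()) and all(
        is_extended(child, predicative) for _, child in arcs)

predicative = {"Harrison wanted", "Brockway tried"}
modifiers = ("pepper", [((1,), ("red", [])), ((1,), ("roasted", []))])
predicates = ("to walk his Labrador",
              [((), ("Harrison wanted", [])), ((), ("Brockway tried", []))])

print(is_standard(modifiers), is_extended(modifiers, predicative))
print(is_extended(predicates, predicative))
```

Note that these predicates only check the relaxed sibling condition; they say nothing about which derived tree is specified, which is exactly the gap with respect to the fourth desideratum.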



4 Formal Definition of Extended Derivations 

In this section we introduce a new framework for describing TAG derivation trees that 
allows for a natural expression of both standard and extended derivations, and makes 
available even more fine-grained restrictions on derivation trees. First, we define ordered 
derivation trees and show that they unambiguously but redundantly specify derivations.¹²
We characterize the redundant trees as those related by a sibling swapping operation.
Derivation trees proper are then taken to be the equivalence classes of ordered
derivation trees where the equivalence relation is generated by the sibling swapping. By
limiting the underlying set of ordered derivation trees in various ways, Vijay-Shanker's
definition of derivation tree, a precise form of the extended definition, and many other 
definitions of derivation can be characterized in this way. 



4.1 Ordered Derivation Trees 

Ordered derivation trees, like the traditional derivation trees, are trees with nodes labeled 
by elementary trees where each arc is labeled with an address in the tree for the parent 
node of the arc. However, the arcs are taken to be ordered with respect to each other. 

12 Historical precedent for independent derivation and the associated ordered derivation trees can be
found in the derivation trees postulated for string adjunct grammars (Joshi, Kosaraju, and Yamada,
1972a, pages 99-100). In this system, siblings in derivation trees are viewed as totally, not partially,
ordered. The systematic ambiguity introduced thereby is eliminated by stipulating that the sibling
order be consistent with an arbitrary ordering on adjunction sites.
An ordered derivation tree is well-formed if for each of its arcs, linking parent node
labeled $\gamma$ to child node labeled $\gamma'$ and itself labeled with address $t$, the tree $\gamma'$ is an
auxiliary tree that can be adjoined at the node $t$ in the tree $\gamma$. (Alternatively, if substitution
is allowed, $\gamma'$ may be an initial tree that can be substituted at the node $t$ in $\gamma$.
Later definitions ignore this possibility, but are easily generalized.) 

We define the function $\mathcal{D}$ from ordered derivation trees to the derived trees they
specify, according to the following recursive definition:

\[
\mathcal{D}(D) =
\begin{cases}
\gamma & \text{if $D$ is a trivial tree of one node labeled with the elementary tree $\gamma$} \\[4pt]
\gamma[\mathcal{D}(D_1)/t_1, \mathcal{D}(D_2)/t_2, \ldots, \mathcal{D}(D_k)/t_k] & \text{if $D$ is a tree with root node labeled with the elementary tree $\gamma$} \\
 & \text{and with $k$ child subtrees $D_1, \ldots, D_k$} \\
 & \text{whose arcs are labeled with addresses $t_1, \ldots, t_k$.}
\end{cases}
\]

Here $\gamma[A_1/t_1, \ldots, A_k/t_k]$ specifies the simultaneous adjunction of trees $A_1$ through $A_k$
at $t_1$ through $t_k$, respectively, in $\gamma$. It is defined as the iterative adjunction of the $A_i$ in
order at their respective addresses, with appropriate updating of the tree addresses of
any later adjunction to reflect the effect of earlier adjunctions that occur at addresses
dominating the address of the later adjunction.
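The recursive definition above, including the address updating, can be sketched in Python as follows. This is one possible reading under stated assumptions, not a reference implementation: trees are encoded as `(label, children)` tuples, a foot node is a leaf whose label ends in "*", addresses are tuples of 1-based child indices, and the sample trees are guesses at the shape of the grammar in Figure 1. All function names are hypothetical.

```python
def subtree(tree, addr):
    for i in addr:
        tree = tree[1][i - 1]
    return tree

def replace(tree, addr, new):
    """Return a copy of `tree` with the subtree at `addr` replaced by `new`."""
    if not addr:
        return new
    label, kids = tree
    i = addr[0] - 1
    return (label, kids[:i] + (replace(kids[i], addr[1:], new),) + kids[i + 1:])

def foot_address(tree, prefix=()):
    label, kids = tree
    if not kids and label.endswith("*"):
        return prefix
    for i, kid in enumerate(kids, start=1):
        found = foot_address(kid, prefix + (i,))
        if found is not None:
            return found
    return None

def adjoin(tree, addr, aux):
    """Adjoin `aux` at `addr`: the excised subtree goes under the foot of `aux`."""
    return replace(tree, addr, replace(aux, foot_address(aux), subtree(tree, addr)))

def simultaneous_adjoin(tree, adjunctions):
    """Adjoin (aux, address) pairs in order, rewriting each address to account
    for earlier adjunctions at addresses properly dominating it."""
    feet = {}  # original address -> foot addresses of trees already adjoined there
    for aux, addr in adjunctions:
        current = ()
        for k, step in enumerate(addr):
            for foot in reversed(feet.get(addr[:k], [])):
                current += foot  # descend through each tree adjoined above
            current += (step,)
        tree = adjoin(tree, current, aux)
        feet.setdefault(addr, []).append(foot_address(aux))
    return tree

def derived(node):
    """Derived tree of an ordered derivation tree
    node = (elementary_tree, [(address, child_node), ...])."""
    elementary, arcs = node
    return simultaneous_adjoin(
        elementary, [(derived(child), addr) for addr, child in arcs])

def leaves(tree):
    label, kids = tree
    return [label] if not kids else [w for kid in kids for w in leaves(kid)]

def word(w):
    return (w, ())

# Assumed shapes for the trees of the sample grammar.
alpha_pe = ("NP", (("N", (word("pepper"),)),))
beta_re = ("N", (("Adj", (word("red"),)), ("N*", ())))
beta_ro = ("N", (("Adj", (word("roasted"),)), ("N*", ())))

independent = (alpha_pe, [((1,), (beta_re, [])), ((1,), (beta_ro, []))])
dependent = (alpha_pe, [((1,), (beta_re, [((), (beta_ro, []))]))])
print(" ".join(leaves(derived(independent))))  # roasted red pepper
print(derived(independent) == derived(dependent))
```

In this encoding, an adjunction at the same address as an earlier one is not rewritten, so the later tree lands above the earlier one; only adjunctions at properly dominated addresses are shifted through the foot of the tree adjoined above them.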

4.2 Derivation Trees 

It is easy to see that the derived tree specified by a given ordered derivation tree is 
unchanged if adjacent siblings whose arcs are labeled with different tree addresses are 
swapped. (This is not true of adjacent siblings whose arcs are labeled with the same 
address.) That is, if $t \neq t'$ then $\gamma[\ldots, A/t, B/t', \ldots] = \gamma[\ldots, B/t', A/t, \ldots]$. A graphical
"proof" of this intuitive fact is given in Figure 4. A formal proof, although tedious and
unenlightening, is possible as well. We provide it in an appendix, primarily because the
definitional aspects of the TAG formulation may be of some interest. 

This fact about the swapping of adjacent siblings shows that ordered derivation 
trees possess an inherent redundancy. The order of adjacent sibling subtrees labeled 
with different tree addresses is immaterial. Consequently, we can define true derivation 
trees to be the equivalence classes of the base set of ordered derivation trees under the 
equivalence relation generated by the sibling subtree swapping operation above. This is 
a well-formed definition by virtue of the proposition argued informally above. 
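One way to compute with such equivalence classes is to pick a canonical representative of each. The Python sketch below relies on an observation that is ours rather than the text's: since swaps never reorder siblings that share an address, two ordered derivation trees are equivalent exactly when, at every node, the siblings at each address appear in the same relative order, so a stable sort of siblings by address yields one representative per class. The nested-tuple encoding and the name `canonical` are hypothetical.

```python
def canonical(node):
    """Canonical representative of the equivalence class of an ordered
    derivation tree node = (tree_name, [(address, child_node), ...]).
    Python's sorted() is stable, so siblings sharing an address keep
    their relative order while different addresses are put in a fixed order."""
    name, arcs = node
    arcs = [(address, canonical(child)) for address, child in arcs]
    return (name, tuple(sorted(arcs, key=lambda arc: arc[0])))

a = ("alpha", [((1,), ("b1", [])), ((2,), ("b2", [])), ((1,), ("b3", []))])
# Swap the first two siblings of `a`, whose addresses differ.
b = ("alpha", [((2,), ("b2", [])), ((1,), ("b1", [])), ((1,), ("b3", []))])
# Reorder the two siblings of `a` that share address 1.
c = ("alpha", [((1,), ("b3", [])), ((2,), ("b2", [])), ((1,), ("b1", []))])

print(canonical(a) == canonical(b))  # True: same derivation tree
print(canonical(a) == canonical(c))  # False: different derivation trees
```

Comparing canonical forms then serves as an equality test on derivation trees proper, without enumerating the members of a class.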

This definition generalizes the traditional definition in not restricting the tree address
labels in any way. It therefore satisfies criterion (3) of Section 3.4. Furthermore, by virtue
of the explicit quotient with respect to sibling swapping, a derivation tree under this 
definition unambiguously and nonredundantly specifies a derived tree (criterion 4). It 
does not, however, differentiate predicative from modifier trees (criterion (1)), nor can 
it therefore mandate dependent derivations for predicative trees (criterion (2)). 

This general approach can, however, be specialized to correspond to several previous 
definitions of derivation tree. For instance, if we further restrict the base set of ordered 
derivation trees so that no two siblings are labeled with the same tree address, then the 
equivalence relation over these ordered derivation trees allows for full reordering of all 
siblings. Clearly, these equivalence classes are isomorphic to the unordered trees, and 
we have reconstructed Vijay-Shanker's standard definition of derivation tree. 







Figure 4: A graphical proof of the irrelevance of adjacent sibling swapping. These 
diagrams show the effect of performing two adjunctions (of auxiliary trees depicted one 
as dark-shaded and one light-shaded), presumed to be specified by adjacent siblings in 
an ordered derivation tree. The adjunctions are to occur at two addresses (referred to 
in this caption as t and t', respectively). The two addresses must be such that either 
(a) they are distinct but neither dominates the other, (b) t dominates t' (or vice versa), 
or (c) they are identical. In case (a) the diagram shows that either order of adjunction 
yields the same derived tree. Adjunction at t and then t' corresponds to the upper 
arrows, adjunction at t' and then t the lower arrows. Similarly, in case (b), adjunction 
at t followed by adjunction at an appropriately updated t' yields the same result as 
adjunction first at t' and then at t. Clearly, adjunctions occurring before these two or 
after do not affect the interchangeability. Thus, if two adjacent siblings in a derivation 
tree specify adjunctions at distinct addresses t and t', the adjunctions can occur in either 
order. Diagram (c) demonstrates that this is not the case when t and t' are the same. 

If we instead restrict ordered derivation trees so that no two siblings corresponding
to predicative trees are labeled with the same tree address, then we have reconstructed 
a version of the extended definition argued for in this paper. Under this restriction, 
criteria (1) and (2) are satisfied, while maintaining (3) and (4). 

By careful selection of other constraints on the base set, other linguistic restrictions 
might be imposed on derivation trees, still using the same definition of derivation trees 
as equivalence classes over ordered derivation trees. In the next section, we show that 
the definition of the previous paragraph should be further restricted to disallow the 
reordering of predicative and modifier trees. We also describe other potential linguistic 
applications of the ability to finely control the notion of derivation through the use of 
ordered derivation trees. 

4.3 Further Restrictions on Extended Derivations 

The extended definition of derivation tree given in the previous section effectively speci- 
fies the output derived tree by adding a partial ordering on sibling arcs that correspond 
to modifier trees adjoined at the same address. All other arcs are effectively unordered 
(in the sense that all relative orderings of them exist in the equivalence class).

Assume that in a given tree $\gamma$ at a particular address t, the k modifier trees $\mu_1, \ldots, \mu_k$
are directly adjoined in that order. Associated with the subtrees rooted at the k elementary
auxiliary trees in this derivation are k derived auxiliary trees ($A_1, \ldots, A_k$, respectively).
The derived tree specified by this derivation tree, according to the definition of
$\mathcal{D}$ given above, would have the derived tree $A_1$ directly below $A_2$ and so forth, with $A_k$
at the top. Now suppose that in addition, a predicative tree $\pi$ is also adjoined at address
t. It must be ordered with respect to the $\mu_i$ in the derivation tree, and its relative order
determines where in the bottom-to-top order in the derived tree the tree $A_\pi$ associated
with the subderivation rooted at $\pi$ goes.

The question that we raise here is whether all k + 1 possible placements of the tree
$\pi$ relative to the $\mu_i$ are linguistically reasonable. We might allow all k + 1 orderings (as
in the definition of the previous section), or we might restrict them by requiring, say, 
that the predicative tree always be adjoined before, or perhaps after, any modifier trees 
at a given address. We emphasize that this is a linguistic question, in the sense that 
the definition of extended derivation is well-formed whatever decision is made on this 
question. 

Henceforth, we will assume that predicative trees are always adjoined after any modi- 
fier trees at the same address, so that they appear above the modifier trees in the derived 
tree. We call this "outermost predication" because a predicative tree appears wrapped 
around the outside of the modifier trees adjoined at the same address. (See Figure 5.)
If we were to mandate innermost predication, in which a predicative tree is always ad- 
joined before the modifier trees at the same address, the predicative tree would appear 
within all of the modifier trees, innermost in the derived tree. 

Linguistically, the outermost method specifies that if both a predicative tree and a 
modifier tree are adjoined at a single node, then the predicative tree attaches higher 
than the modifier tree; in terms of the derived tree, it is as if the predicative tree were 
adjoined at the root of the modifier tree. This accords with the semantic intuition that 
in such a case (for English at least), the modifier is modifying the original tree, not the 
predicative one. (The alternate "reading", in which the modifier modifies the predicative
tree, is still obtainable under an outermost-predication standard by having the modifier
auxiliary tree adjoin dependently at the root node of the predicative tree.) In contrast,
the innermost-predication method specifies that the modifier tree attaches higher, as if
the modifier tree adjoined at the root of the predicative tree and was therefore modifying
the predicative tree, contra semantic intuitions.

Figure 5: Schematic extended derivation tree and associated derived tree. In a derivation
tree, the predicative tree adjoined at an address t is required to follow all modifier trees
adjoined at the same address, as in (a). The derived tree therefore appears as depicted
in (b) with the predicative tree outermost.

For this reason, we specify that outermost predication is mandated. This is eas- 
ily done by further limiting the base set of ordered derivation trees to those in which 
predicative trees are ordered after modifier tree siblings. 
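
The restriction on the base set can be pictured as a simple check on the ordered sibling list of a derivation-tree node. The sketch below assumes a hypothetical encoding in which each sibling is an (address, kind) pair, with kind either "pred" or "mod"; it accepts a sibling sequence only if no address carries two predicative trees and no modifier tree follows a predicative tree at the same address.

```python
from typing import List, Tuple

Sibling = Tuple[str, str]   # (address, kind), kind is "pred" or "mod" (assumed encoding)


def satisfies_outermost_predication(siblings: List[Sibling]) -> bool:
    """Check one node's ordered siblings against the outermost-predication restriction."""
    addresses_with_pred = set()
    for address, kind in siblings:
        if kind == "pred":
            if address in addresses_with_pred:
                return False          # two predicative trees at one address
            addresses_with_pred.add(address)
        elif address in addresses_with_pred:
            return False              # modifier ordered after the predicative tree
    return True
```

Siblings at different addresses do not constrain one another, which matches the free reordering that the equivalence relation already allows for them.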

(From a technical standpoint, by the way, the outermost-predication method has the 
advantage that it requires no changes to the parsing rules to be presented later, but only 
a single addition. The innermost-predication method induces some subtle interactions 
between the original parsing rules and the additional one, necessitating a much more 
complicated set of modifications to the original algorithm. In fact, the complexities in 
generating such an algorithm constituted the precipitating factor that led us to revise 
our original, innermost-predication, attempt at redefining tree-adjoining derivation. The 
linguistic argument, although commanding, became clear to us only later.) 

Another possibility, which we mention but do not pursue here, is to allow for
language-particular precedence constraints to restrict the possible orderings of derivation-tree
siblings, in a manner similar to the linear precedence constraints of ID/LP format (Gazdar
et al., 1985) but at the level of derivation trees. These might be interpreted as hard
constraints or soft orderings depending on the application. This more fine-grained
approach to the issue of ordering has several applications. Soft orderings might be used
to account for ordering preferences among modifiers, such as the default ordering of
English adjectives that accounts for the typical preference for "a large red ball" over "? a
red large ball" and the typical ordering of temporal before spatial adverbial phrases in
German.

Similarly, hard constraints might allow for the handling of an apparent counterexample
to the outermost-predication rule.[13] One natural analysis of the sentence

(13) At what time did Brockway say Harrison arrived? 

would involve adjunction of a predicative tree for the phrase "did Brockway say" at 
the root of the tree for "Harrison arrived". A Wh modifier tree "at what time" must 
be adjoined in as well. The example question is ambiguous, of course, as to whether 
it questions the time of the saying or of the arriving. In the former case, the modifier 
tree presumably adjoins at the root of the predicative tree for "did Brockway say" that 
it modifies. In the latter case, which is of primary interest here, it must adjoin at the 
root of the tree for "Harrison arrived" . Thus, both trees would be adjoined at the same 
address, and the outermost-predication rule would predict the derived sentence to be 
"Did Brockway say at what time Harrison arrived." To get around this problem, we 
might specify hard ordering constraints for English that place all Wh modifier trees after 
all predicative trees, which in turn come after all non-Wh modifier trees. This would
place the Wh modifier outermost as required. 
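
Such hard ordering constraints amount to assigning each class of auxiliary tree a rank and requiring ranks to be nondecreasing among siblings at the same address. The following sketch illustrates this under assumed names: the kind labels and the `RANK` table are hypothetical and encode only the English ordering suggested above (non-Wh modifiers, then predicative trees, then Wh modifiers).

```python
from typing import Dict, List, Tuple

# Hypothetical rank table: lower ranks are adjoined earlier (lower in the derived tree).
RANK: Dict[str, int] = {"mod": 0, "pred": 1, "wh_mod": 2}


def respects_precedence(siblings: List[Tuple[str, str]],
                        rank: Dict[str, int] = RANK) -> bool:
    """Check that, per address, sibling ranks never decrease in derivation order."""
    last_rank: Dict[str, int] = {}
    for address, kind in siblings:
        r = rank[kind]
        if r < last_rank.get(address, r):
            return False
        last_rank[address] = r
    return True
```

A soft-ordering variant could return a penalty count instead of a boolean; that choice is left open here.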

Although we find this extra flexibility to be an attractive aspect of this approach, 
we stay with the more stringent outermost-predication restriction in the material that 
follows. 



5 Compilation of TAGs to Linear Indexed Grammars 

In this section, we present a technique for compiling tree-adjoining grammars into linear 
indexed grammars such that the linear-indexed grammar makes explicit the extended 
derivations of the TAG. This compilation plays two roles. First, it provides for a simple 
proof of the generative equivalence of TAGs under the standard and extended definitions 
of derivation, as described at the end of this section. Second, it can be used as the basis 
for a parsing algorithm that recovers the extended derivations for strings. The design of 
such an algorithm is the topic of Section 6.

Linear indexed grammars (LIG) constitute a grammatical framework based, like 
context-free, context-sensitive, and unrestricted rewriting systems, on rewriting strings 
of nonterminal and terminal symbols. Unlike these systems, linear indexed grammars, 
like the indexed grammars from which they are restricted, allow stacks of marker sym- 
bols, called indices, to be associated with the nonterminal symbols being rewritten. The 
linear version of the formalism allows the full index information from the parent to be 
used to specify the index information for only one of the child constituents. Thus, a 
linear indexed production can be given schematically as: 

$$N_0[..\beta_0] \rightarrow N_1[\beta_1] \cdots N_{s-1}[\beta_{s-1}]\; N_s[..\beta_s]\; N_{s+1}[\beta_{s+1}] \cdots N_k[\beta_k]$$

The $N_i$ are nonterminals, the $\beta_i$ strings of indices. The notation ".." stands for the
remainder of the stack below the given string of indices. Note that only one element
on the right-hand side, $N_s$, inherits the remainder of the stack from the parent. (This
schematic rule is intended to be indicative, not definitive. We ignore issues such as the
optionality of the inherited stack, how terminal symbols fit in, and so forth. Vijay-Shanker
and Weir (1990) present a complete discussion.)

13 Other solutions are possible that do not require extended derivations or linear precedence
constraints. For instance, we might postulate an elementary tree for the verb arrived that includes a
substitution node for a fronted adverbial Wh phrase.

Figure 6: Schematic structure of adjunction with top and bottom of each node separated

Vijay-Shanker and Weir (1990) present a way of specifying any TAG as a linear
indexed grammar. The LIG version makes explicit the standard notion of derivation
being presumed. Also, the LIG version of a TAG grammar can be used for recognition
and parsing. Because the LIG formalism is based on augmented rewriting, the parsing
algorithms can be much simpler to understand and easier to modify, and no loss of
generality is incurred. For these reasons, we use the technique in this work.

The compilation process that manifests the standard definition of derivation can be
most easily understood by viewing nodes in a TAG elementary tree as having both a
top and bottom component, identically marked for nonterminal category, that dominate
(but may not immediately dominate) each other. (See Figure 6.) The rewrite rules of
the corresponding linear indexed grammar capture the immediate domination between
a bottom node and its child top nodes directly, and capture the domination between
top and bottom parts of the same node by optionally allowing rewriting from the top of
a node to an appropriate auxiliary tree, and from the foot of the auxiliary tree back to
the bottom of the node. The index stack keeps track of the nodes that adjunction has
occurred on so that the recognition to the left and the right of the foot node will occur
under identical assumptions of derivation structure.

The TAG grammar is encoded as a LIG with two nonterminal symbols t and b
corresponding to the top and bottom components, respectively, of each node. The stack
indices correspond to the individual nodes of the elementary trees of the TAG grammar.
Thus, there are as many stack index symbols as there are nodes in the elementary trees
of the grammar, and each such index (i.e., node) corresponds unambiguously to a single
address in a single elementary tree. (In fact, the symbols can be thought of as pairs of
an elementary tree identifier and an address within that tree, and our implementation
encodes them in just that way.) The index at the top of the stack corresponds to the
node being rewritten. Thus, a LIG nonterminal with stack $t[\eta]$ corresponds to the top
component of node $\eta$, and $b[\eta_1 \eta_2 \eta_3]$ corresponds to the bottom component of $\eta_3$. The
indices $\eta_1$ and $\eta_2$ capture the history of adjunctions that are pending completion of the
tree in which $\eta_3$ is a node. Figure 7 depicts the interpretation of a stack of indices.

Figure 7: A stack of indices $[\eta_1 \eta_2 \eta_3]$ captures the adjunction history that led to the
reaching of the node $\eta_3$ in the parsing process. Parsing of an elementary tree $\alpha$ proceeded
to node $\eta_1$ in that tree, at which point adjunction of the tree containing $\eta_2$ was pursued
by the parser. When the node $\eta_2$ was reached, the tree containing $\eta_3$ was implicitly
adjoined. Once this latter tree is completely parsed, the remainder of the tree containing
$\eta_2$ can be parsed from that point, and so on.

In summary, given a tree-adjoining grammar, the following LIG rules are generated:

1. Immediate domination dominating foot: For each auxiliary tree node $\eta$ that dominates
the foot node, with children $\eta_1, \ldots, \eta_s, \ldots, \eta_n$, where $\eta_s$ is the child that also
dominates the foot node, include a production

$$b[..\eta] \rightarrow t[\eta_1] \cdots t[\eta_{s-1}]\; t[..\eta_s]\; t[\eta_{s+1}] \cdots t[\eta_n]$$

2. Immediate domination not including foot: For each elementary tree node $\eta$ that
does not dominate a foot node, with children $\eta_1, \ldots, \eta_n$, include a production

$$b[\eta] \rightarrow t[\eta_1] \cdots t[\eta_n]$$

3. No adjunction: For each elementary tree node $\eta$ that is not marked for substitution
or obligatory adjunction, include a production

$$t[..\eta] \rightarrow b[..\eta]$$

4. Start root of adjunction: For each elementary tree node $\eta$ on which the auxiliary
tree $\beta$ with root node $\eta_r$ can be adjoined, include the following production:

$$t[..\eta] \rightarrow t[..\eta\,\eta_r]$$

5. Start foot of adjunction: For each elementary tree node $\eta$ on which the auxiliary
tree $\beta$ with foot node $\eta_f$ can be adjoined, include the following production:

$$b[..\eta\,\eta_f] \rightarrow b[..\eta]$$

6. Start substitution: For each elementary tree node $\eta$ marked for substitution on
which the initial tree $\alpha$ with root node $\eta_r$ can be substituted, include the production

$$t[\eta] \rightarrow t[\eta_r]$$

We will refer to productions generated by Rule i above as Type i productions. For
example, Type 3 productions are of the form $t[..\eta] \rightarrow b[..\eta]$. For further information
concerning the compilation see the work of Vijay-Shanker and Weir (1990). For present
purposes, it is sufficient to note that the method directly embeds the standard notion of 
derivation in the rewriting process. To perform an adjunction, we move (by Rule 4) from 
the node adjoined at to the top of the root of the auxiliary tree. At the root, additional 
adjunctions might be performed. When returning from the foot of the auxiliary tree 
back to the node where adjunction occurred, rewriting continues at the bottom of the 
node (see Rule 5), not the top, so that no more adjunctions can be started at that node. 
Thus, the dependent nature of predicative adjunction is enforced because only a single 
adjunction can occur at any given node. 
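
The six compilation rules can be sketched as a short program that emits productions as strings. This is an illustration under stated assumptions, not the implementation referred to above: the `TNode` class, the representation of terminals as plain strings among a node's children, and the `adjoinable` and `substitutable` tables (listing which auxiliary or initial tree may attach at which node) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class TNode:
    """A node of an elementary tree; `name` doubles as the LIG stack index."""
    name: str
    children: List[Union["TNode", str]] = field(default_factory=list)  # str = terminal
    foot: bool = False
    subst: bool = False      # marked for substitution
    oa: bool = False         # marked for obligatory adjunction


def dominates_foot(n: Union[TNode, str]) -> bool:
    if isinstance(n, str):
        return False
    return n.foot or any(dominates_foot(c) for c in n.children)


def walk(n: TNode):
    yield n
    for c in n.children:
        if isinstance(c, TNode):
            yield from walk(c)


def compile_to_lig(roots: List[TNode],
                   adjoinable: List[Tuple[str, str, str]],
                   substitutable: List[Tuple[str, str]]) -> List[str]:
    """adjoinable: (node, aux_root, aux_foot); substitutable: (node, init_root)."""
    prods = []
    for root in roots:
        for n in walk(root):
            if n.children:
                rhs = []
                for c in n.children:
                    if isinstance(c, str):
                        rhs.append(c)                       # terminal symbol
                    elif dominates_foot(c):
                        rhs.append(f"t[..{c.name}]")        # child inheriting the stack
                    else:
                        rhs.append(f"t[{c.name}]")
                # Rule 1 if n dominates the foot, Rule 2 otherwise.
                lhs = f"b[..{n.name}]" if dominates_foot(n) else f"b[{n.name}]"
                prods.append(f"{lhs} -> {' '.join(rhs)}")
            if not (n.subst or n.oa):
                prods.append(f"t[..{n.name}] -> b[..{n.name}]")          # Rule 3
    for node, aux_root, aux_foot in adjoinable:
        prods.append(f"t[..{node}] -> t[..{node} {aux_root}]")            # Rule 4
        prods.append(f"b[..{node} {aux_foot}] -> b[..{node}]")            # Rule 5
    for node, init_root in substitutable:
        prods.append(f"t[{node}] -> t[{init_root}]")                      # Rule 6
    return prods
```

For a grammar with an initial tree rooted at `a0` and an auxiliary tree with root `b0` and foot `bf` adjoinable at `a0`, the output includes `t[..a0] -> t[..a0 b0]` and `b[..a0 bf] -> b[..a0]`, the two productions that enter and leave the adjoined tree.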

In order to permit extended derivations, we must allow for multiple modifier tree 
adjunctions at a single node. There are two natural ways this might be accomplished, 
as depicted in Figure 8.

1. Modified start foot of adjunction rule: Allow moving from the bottom of the foot
of a modifier auxiliary tree to the top (rather than the bottom) of the node at
which it adjoined (Figure 8b).

2. Modified start root of adjunction rule: Allow moving from the bottom (rather than
the top) of a node to the top of the root of a modifier auxiliary tree (Figure 8c).

As can be seen from the figures, both of these methods allow recursion at a node, unlike
the original method depicted in Figure 8a. Thus multiple modifier trees are allowed to
adjoin at a single node. Note that since predicative trees fall under the original rules, at
most a single predicative tree can be adjoined at a node. The two methods correspond
exactly to the innermost- and outermost-predication methods discussed in Section 4.3.
For the reasons described there, the latter is preferred.[14]

In summary, independent derivation structures can be allowed for modifier auxiliary 
trees by starting the adjunction process from the bottom, rather than the top of a node 
for those trees. Thus, we split Type 4 LIG productions into two subtypes for predicative 
and modifier trees, respectively. 

14 The more general definition allowing predicative trees to occur anywhere within a sequence of 
modifier adjunctions would be achieved by adding both types of rules. 

4a. Start root of predicative adjunction: For each elementary tree node $\eta$ on which
the predicative auxiliary tree $\beta$ with root node $\eta_r$ can be adjoined, include the
following production:

$$t[..\eta] \rightarrow t[..\eta\,\eta_r]$$

4b. Start root of modifier adjunction: For each elementary tree node $\eta$ on which the
modifier auxiliary tree $\beta$ with root node $\eta_r$ can be adjoined, include the following
production:

$$b[..\eta] \rightarrow t[..\eta\,\eta_r]$$

Once this augmentation has been made, we no longer need to allow for adjunctions at
the root nodes of modifier auxiliary trees, as repeated adjunction is now allowed for by
the new rule 4b. Consequently, grammars should forbid adjunction of a modifier tree $\beta_1$
at the root of a modifier tree $\beta_2$ except where $\beta_1$ is intended to modify $\beta_2$ directly.

This simple modification to the compilation process from TAG to LIG fully specifies 
the modified notion of derivation. Note that the extra criterion (5) noted in Section 3.4 is 
satisfied by this definition: Modifier adjunctions are inherently repeatable and eliminable 
as the movement through the adjunction "loop" ends up at the same point that it 
begins. The recognition algorithms for TAG based on this compilation, however, must 
be adjusted to allow for the new rule types. 
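
The split of Rule 4 and the resulting adjunction "loop" can be illustrated with a few lines of code. The sketch assumes productions are written as plain strings and that each adjoinable auxiliary tree carries a flag saying whether it is a modifier tree; both function names are hypothetical.

```python
from typing import Tuple


def start_root_production(node: str, aux_root: str, modifier: bool) -> str:
    """Rule 4b (modifier) starts from the bottom of the node, Rule 4a from the top."""
    lhs_nonterminal = "b" if modifier else "t"
    return f"{lhs_nonterminal}[..{node}] -> t[..{node} {aux_root}]"


def adjunction_loop(node: str, modifier: bool) -> Tuple[str, str]:
    """Return the symbol at which an adjunction starts and the one at which it ends.

    The end point is always the bottom of the node, by Rule 5
    (b[..node foot] -> b[..node]).
    """
    start = f"{'b' if modifier else 't'}[..{node}]"
    end = f"b[..{node}]"
    return start, end
```

For a modifier tree the start and end symbols coincide, so the loop can be taken any number of times, including zero; for a predicative tree they differ, so the loop can be entered at most once.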

This compilation makes possible a simple proof of the weak-generative equivalence
of TAGs under the standard and extended derivations.[15] Call the set of languages
generable by a TAG under the standard definition of derivation $TAL_s$ and under the
extended definition $TAL_e$. Clearly, $TAL_s \subseteq TAL_e$ since the standard definition can
be mimicked by making all auxiliary trees predicative. The compilation above provides
the inclusion $TAL_e \subseteq LIL$, where $LIL$ is the set of linear indexed languages. The
final inclusion $LIL \subseteq TAL_s$ has been shown indirectly by Vijay-Shanker (1987) using
embedded push-down automata and modified head grammars as intermediaries. From
these inclusions, we can conclude that $TAL_s = TAL_e$.



6 Recognition and Parsing 

A recognition algorithm for TAGs can be constructed based on the above translation 
into corresponding LIGs as specified by Rules 1 through 6 in the previous section. The 
algorithm is not a full recognition algorithm for LIGs, but rather, is tuned for exactly 
the types of rules generated as output of this compilation process. In this section, we 
present the recognition algorithm and modify it to work with the extended derivation 
compilation. 

We will use the following notations in this and later sections. The symbol P will
serve as a variable over the two LIG grammar nonterminals t and b. The substring of
the string $w_1 \cdots w_n$ being parsed between indices i and j will be notated as $w_{i+1} \cdots w_j$,
which we take to be the empty string when i is greater than or equal to j. We will use
$\Gamma$, $\Delta$, and $\Theta$ for sequences containing terminals and LIG nonterminals with their stack
specifications. For instance, $\Gamma$ might be $t[\eta_1]\,t[..\eta_2]\,t[\eta_3]$.

15 We are grateful to K. Vijay-Shanker for bringing this point to our attention.

The parsing algorithm can be seen as a tabular parsing method based on deduction
of items, as in Earley deduction (Pereira and Warren, 1983). We will so describe it, by
presenting inference rules over items of the form

$$\langle P[\eta] \rightarrow \Gamma \bullet \Delta,\; i, j, k, l \rangle$$

Such items play the role of the items of Earley's algorithm. Unlike the items of Earley's
algorithm, however, an item of this form does not embed a grammar rule proper, that
is, $P[\eta] \rightarrow \Gamma\Delta$ is not necessarily a rule of the grammar. Rather, it is what we will
call a reduced rule; for reasons described below, the nonterminals in $\Gamma$ and $\Delta$ as well
as the nonterminal $P[\eta]$ record only the top element of each stack of indices. We will
use the notation $\overline{P[\eta] \rightarrow \Gamma\Delta}$ for the unreduced form of the rule whose reduced form is
$P[\eta] \rightarrow \Gamma\Delta$. For instance, the rule specified by the notation $\overline{t[\eta_1] \rightarrow t[\eta_2]}$ might be the
rule $t[..\eta_1] \rightarrow t[..\eta_1 \eta_2]$. The reader can easily verify that the TAG to LIG compilation
is such that there is a one-to-one correspondence between the generated rules and their
reduced form. Consequently, this notation is well-defined.
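
Reduction of a rule can be sketched as follows. The `Sym` tuple is a hypothetical encoding of a LIG nonterminal occurrence: the nonterminal name, a flag recording whether the stack specification begins with "..", and the explicitly listed indices with the top of the stack last. Reduction keeps only the top index of each symbol and discards the inheritance marker.

```python
from typing import NamedTuple, Tuple, Union


class Sym(NamedTuple):
    nonterminal: str            # "t" or "b"
    inherits: bool              # True if the stack specification starts with ".."
    indices: Tuple[str, ...]    # explicit indices, top of stack last


Symbol = Union[Sym, str]        # a plain string stands for a terminal


def reduce_symbol(sym: Symbol) -> Symbol:
    """Keep only the top stack element of a nonterminal; leave terminals alone."""
    if isinstance(sym, str):
        return sym
    return Sym(sym.nonterminal, False, sym.indices[-1:])


def reduce_rule(lhs: Sym, rhs: Tuple[Symbol, ...]) -> Tuple[Symbol, Tuple[Symbol, ...]]:
    """Return the reduced form of the production lhs -> rhs."""
    return reduce_symbol(lhs), tuple(reduce_symbol(s) for s in rhs)
```

Applied to the unreduced rule `t[..n1] -> t[..n1 n2]`, this yields the reduced rule `t[n1] -> t[n2]`, matching the example in the text.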

The dot in the items is analogous to that found in Earley and LR items as well.
It serves as a marker for how far recognition has proceeded in identifying the subconstituents
for this rule. The indices i, j, k, and l specify the portion of the string $w_1 \cdots w_n$
covered by the recognition of the item. The substring between i and l (i.e., $w_{i+1} \cdots w_l$)
has been recognized, perhaps with a region between j and k where the foot of the tree
below the node $\eta$ has been recognized. (If the foot node is not dominated by $\Gamma$, we take
the values of j and k to be the dummy value '-'.)

6.1 The Inference Rules 

In this section, we specify several inference rules for parsing a LIG generated from a
TAG. One explanatory comment is in order, however,
before the rules are presented. The rules of a LIG associate with each constituent a
nonterminal and a stack of indices. It seems natural for a parsing algorithm to maintain
this association by building items that specify for each constituent the full information
of nonterminal and index stack. However, this would necessitate storing an unbounded
amount of information for each potential constituent, resulting in a parsing algorithm
that is potentially quite inefficient when nondeterminism arises during the parsing process,
and perhaps non-effective if the grammar is infinitely ambiguous. Instead, the
parse items manipulated by the inference rules that we present do not keep all of this
information for each constituent. Rather, the items keep only the single top stack element
for each constituent (in addition to the nonterminal symbol). This drastically
decreases the number of possible items, and accounts for the polynomial character of
the resultant algorithm.[16] Side conditions make up for some of the loss of information,
thereby maintaining correctness. For instance, the Type 4 Completor rule specifies a
relation between $\eta$ and $\eta_f$ that takes the place of popping an element off of the stack
associated with $\eta_f$. However, the side conditions are strictly weaker than maintaining
full stack information. Consequently, the algorithm, though correct, does not maintain
the valid prefix property. See the work of Schabes (1991) for further discussion and
alternatives.

16 Vijay-Shanker and Weir (1990) first proposed the recording of only the top stack element in order to
achieve efficient parsing. The algorithm they presented is a bottom-up general LIG parsing algorithm.
Schabes (1991) sketches a proof of an $O(n^6)$ bound for an Earley-style algorithm for TAG parsing that
is more closely related to the algorithm proposed here.

Scanning and prediction work much as in Earley's original algorithm.

• Scanner:

$$\frac{\langle b[\eta] \rightarrow \Gamma \bullet a \Delta,\; i, j, k, l \rangle}{\langle b[\eta] \rightarrow \Gamma a \bullet \Delta,\; i, j, k, l+1 \rangle} \quad a = w_{l+1}$$

Note that the only rules that need be considered are those where the parent is a
bottom node, as terminal symbols occur on the right-hand side only of Type 1 or
2 productions. Otherwise, the rule is exactly as that for Earley's algorithm except
that the extra foot indices (j and k) are carried along.



• Predictor:

$$\frac{\langle P[\eta] \rightarrow \Gamma \bullet P'[\eta'] \Delta,\; i, j, k, l \rangle}{\langle P'[\eta'] \rightarrow \bullet\, \Theta,\; l, -, -, l \rangle} \quad \overline{P'[\eta'] \rightarrow \Theta}$$

This rule serves to form predictions for any type of production in the grammar, as
the variables P and P' range over the values t and b. In the predicted item, the
foot is not dominated by the (empty) recognized input, so that the dummy value
'-' is used for the foot indices. Note that the predicted item records the reduced
form of an unreduced rule $\overline{P'[\eta'] \rightarrow \Theta}$ of the grammar.
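
The Scanner and Predictor can be sketched as functions over items. The encoding is an assumption made for illustration: an item is a named tuple holding a reduced rule (symbols written as strings such as "b[n]", with any other string treated as a terminal), a dot position, and the four indices, with `None` standing for the dummy value. The input is a zero-indexed list, so $w_{l+1}$ is `words[l]`.

```python
from typing import List, NamedTuple, Optional, Tuple


class Item(NamedTuple):
    lhs: str                  # reduced left-hand side, e.g. "b[n]"
    rhs: Tuple[str, ...]      # reduced right-hand side
    dot: int                  # number of right-hand-side symbols already recognized
    i: int
    j: Optional[int]          # foot indices; None plays the role of the dummy value
    k: Optional[int]
    l: int


def scan(item: Item, words: List[str]) -> Optional[Item]:
    """Scanner: move the dot over a terminal that matches the next input word."""
    if (item.lhs.startswith("b[") and item.dot < len(item.rhs)
            and item.l < len(words) and item.rhs[item.dot] == words[item.l]):
        return item._replace(dot=item.dot + 1, l=item.l + 1)
    return None


def predict(item: Item, reduced_rules: List[Tuple[str, Tuple[str, ...]]]) -> List[Item]:
    """Predictor: start every reduced rule whose left-hand side follows the dot."""
    if item.dot >= len(item.rhs):
        return []
    wanted = item.rhs[item.dot]
    return [Item(lhs, rhs, 0, item.l, None, None, item.l)
            for lhs, rhs in reduced_rules if lhs == wanted]
```

Predicted items start and end at position l and carry dummy foot indices, as in the rule above; the side condition is reflected in `reduced_rules` being the reduced forms of actual grammar productions.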

Completion of items (moving of the dot from left to right over a nonterminal) breaks 
up into several cases, depending on which production type is being completed. This 
is because the addition of the extra indices and the separate interpretations for top 
and bottom productions require differing index manipulations to be performed. We 
will list the various steps, organized by what type of production they participate in the 
completion of. 

Productions that specify immediate domination (from Rules 1 and 2) are completed 
whenever the top of the child node is fully recognized. 

• Type 1 and 2 Completor:

$$\frac{\langle b[\eta_1] \rightarrow \Gamma \bullet t[\eta] \Delta,\; m, j', k', i \rangle \quad \langle t[\eta] \rightarrow \Theta\, \bullet,\; i, j, k, l \rangle}{\langle b[\eta_1] \rightarrow \Gamma\, t[\eta] \bullet \Delta,\; m, j \cup j', k \cup k', l \rangle}$$

Here, $t[\eta]$ has been fully recognized as the substring between i and l. The item
expecting $t[\eta]$ can be completed. One of the two antecedent items might also
dominate the foot node of the tree to which $\eta$ and $\eta_1$ belong, and would therefore
have indices for the foot substring. The operations $j \cup j'$ and $k \cup k'$ are used to
specify whichever of j or j' (and respectively for k or k') contain foot substring
indices. The formal definition of $\cup$ is as follows:

$$j \cup j' = \begin{cases} j & \text{if } j' = - \\ j' & \text{if } j = - \\ j & \text{if } j = j' \\ \text{undefined} & \text{otherwise} \end{cases}$$
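
The index-combining operation can be written directly from its definition. In this sketch, `None` is an assumed stand-in for the dummy value, and the undefined case is signaled with an exception so that a caller can simply discard the attempted completion.

```python
from typing import Optional

DUMMY = None   # stands for the dummy foot index '-'


def foot_union(j: Optional[int], j_prime: Optional[int]) -> Optional[int]:
    """Combine two foot indices, at most one of which carries real information."""
    if j_prime is DUMMY:
        return j
    if j is DUMMY:
        return j_prime
    if j == j_prime:
        return j
    raise ValueError("foot indices disagree; the union is undefined")
```

When both arguments are the dummy value the result is again the dummy value, which covers completions in which neither antecedent dominates the foot node.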



The remaining rules (3 through 6) are each completed by a particular completion 
instance. 

• Type 3 Completor:

$$\frac{\langle t[\eta] \rightarrow \bullet\, b[\eta],\; i, -, -, i \rangle \quad \langle b[\eta] \rightarrow \Theta\, \bullet,\; i, j, k, l \rangle}{\langle t[\eta] \rightarrow b[\eta]\, \bullet,\; i, j, k, l \rangle}$$

This rule is used to complete a prediction that no (predicative) adjunction occurs
at node $\eta$. Once the part of the string dominated by $b[\eta]$ has been found, as
evidenced by the second antecedent item, the prediction of no adjunction can be
completed.

• Type 4 Completor:

$$\frac{\langle t[\eta] \rightarrow \bullet\, t[\eta_r],\; i, -, -, i \rangle \quad \langle t[\eta_r] \rightarrow \Theta\, \bullet,\; i, j, k, l \rangle \quad \langle b[\eta] \rightarrow \Delta\, \bullet,\; j, p, q, k \rangle}{\langle t[\eta] \rightarrow t[\eta_r]\, \bullet,\; i, p, q, l \rangle} \quad \overline{t[..\eta] \rightarrow t[..\eta\,\eta_r]}$$

Here, an adjunction has been predicted at $\eta$, and the adjoined derived tree (between
$t[\eta]$ and $b[\eta]$) and the derived material that $\eta$ itself dominates (below $b[\eta]$)
have both been completed. Thus $t[\eta]$ is completely recognized. Note that the side
condition (the unreduced form of the reduced rule in the first antecedent item) is
placed merely to guarantee that $\eta_r$ is the root node of an adjoinable auxiliary tree.

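• Type 5 Completor:

$$\frac{\langle b[\eta_f] \rightarrow \bullet\, b[\eta],\; i, -, -, i \rangle \quad \langle b[\eta] \rightarrow \Theta\, \bullet,\; i, j, k, l \rangle}{\langle b[\eta_f] \rightarrow b[\eta]\, \bullet,\; i, i, l, l \rangle} \quad \overline{b[..\eta\,\eta_f] \rightarrow b[..\eta]}$$

When adjunction has been performed, and recognition up to the foot node $\eta_f$
has been performed, it is necessary to recognize all the material under the foot
node. When that is done, the foot node prediction can be completed. Note that it
must be possible to have adjoined the auxiliary tree at node $\eta$ as specified in the
production in the side condition.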

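• Type 6 Completor:

$$\frac{\langle t[\eta] \rightarrow \bullet\, t[\eta_r],\; i, -, -, i \rangle \quad \langle t[\eta_r] \rightarrow \Theta\, \bullet,\; i, -, -, l \rangle}{\langle t[\eta] \rightarrow t[\eta_r]\, \bullet,\; i, -, -, l \rangle} \quad \overline{t[\eta] \rightarrow t[\eta_r]}$$

Completion of the material below the root node $\eta_r$ of an initial tree allows for the
completion of the node at which substitution occurred.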

The recognition process for a string $w_1 \cdots w_n$ starts with some items that serve
as axioms for these inference rules. For each rule $t[\eta_s] \rightarrow \Gamma$ where $\eta_s$ is the root
node of an initial tree which node is labeled with the start nonterminal, the item
$\langle t[\eta_s] \rightarrow \bullet\, \Gamma,\; 0, -, -, 0 \rangle$ is an axiom. If from these axioms an item of the form
$\langle t[\eta_s] \rightarrow \Gamma\, \bullet,\; 0, -, -, n \rangle$ can be proved according to the rules of inference above, the
string is accepted; otherwise it is rejected.

Alternatively, the axioms can be stated as if there were extra rules $S \rightarrow t[\eta_s]$ for
each start-nonterminal-labeled root node $\eta_s$ of an initial tree. In this case, the axioms
are items of the form $\langle S \rightarrow \bullet\, t[\eta_s],\; 0, -, -, 0 \rangle$ and the string is accepted upon proving
$\langle S \rightarrow t[\eta_s]\, \bullet,\; 0, -, -, n \rangle$. In this case, an extra prediction and completion rule is needed
just for these rules, since the normal rules do not allow S on the left-hand side. This
point is taken up further in Section 6.4.

Generation of items can be cached in the standard way for inference-based parsing
algorithms (Shieber, 1992); this leads to a tabular or chart-based parsing algorithm.
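
The caching of items can be sketched as a generic agenda-driven closure, independent of the particular inference rules. The function names are hypothetical, and the demonstration rule (transitivity over pairs) is only a toy stand-in for the parsing rules above; it is not a TAG recognizer.

```python
from typing import Callable, Hashable, Iterable, List, Set


def close(axioms: Iterable[Hashable],
          step: Callable[[Hashable, Set[Hashable]], List[Hashable]]) -> Set[Hashable]:
    """Compute all items provable from the axioms.

    `step(item, chart)` must return every consequent that has `item` as one
    antecedent and draws its other antecedents from `chart`.
    """
    chart: Set[Hashable] = set()
    agenda = list(axioms)
    while agenda:
        item = agenda.pop()
        if item in chart:
            continue                      # already derived; the chart is the cache
        chart.add(item)
        agenda.extend(step(item, chart))
    return chart


def edge_step(item, chart):
    """Toy inference rule: from (x, y) and (y, z) infer (x, z)."""
    x, y = item
    out = []
    for u, v in list(chart):
        if u == y:
            out.append((x, v))            # item is the left antecedent
        if v == x:
            out.append((u, y))            # item is the right antecedent
    return out
```

Because each item enters the chart at most once, the work done is bounded by the number of distinct items times the cost of one `step`, which is where the restriction of items to top stack elements pays off.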



6.2 The Algorithm Invariant 

The algorithm maintains an invariant that holds of all items added to the chart. We
will describe the invariant using some additional notational conventions. Recall that
$\overline{P[\eta] \rightarrow \Gamma}$ is the LIG production in the grammar whose reduced form is $P[\eta] \rightarrow \Gamma$. The
notation $\overline{\Gamma}[\gamma]$, where $\gamma$ is a sequence of stack symbols (i.e., nodes), specifies the sequence
$\overline{\Gamma}$ with $\gamma$ replacing the occurrence of .. in the stack specifications. For example, if $\overline{\Gamma}$ is
the sequence $t[\eta_1]\,t[..\eta_2]\,t[\eta_3]$, then $\overline{\Gamma}[\gamma] = t[\eta_1]\,t[\gamma\eta_2]\,t[\eta_3]$. A single LIG derivation step
will be notated with $\Rightarrow$ and its reflexive transitive closure with $\Rightarrow^*$.

The invariant specifies that $\langle P[\eta] \rightarrow \Gamma \bullet \Delta,\; i, j, k, l \rangle$ is in the chart only if[17]

1. If node $\eta$ dominates the foot node $\eta_f$ of the tree to which it belongs, then there
exists a string of stack symbols (i.e., nodes) $\gamma$ such that

(a) $\overline{P[\eta] \rightarrow \Gamma\Delta}$ is a LIG rule in the grammar, where $\overline{\Gamma}$ is the unreduced form of $\Gamma$.

(b) $\overline{\Gamma}[\gamma] \Rightarrow^* w_{i+1} \cdots w_j\; b[\gamma\eta_f]\; w_{k+1} \cdots w_l$

(c) $b[\gamma\eta_f] \Rightarrow^* w_{j+1} \cdots w_k$

2. If node $\eta$ does not dominate the foot node $\eta_f$ of the tree to which it belongs or
there is no foot node in the tree, then

(a) $\overline{P[\eta] \rightarrow \Gamma\Delta}$ is a LIG rule in the grammar, where $\overline{\Gamma}$ is the unreduced form of $\Gamma$.

(b) $\overline{\Gamma} \Rightarrow^* w_{i+1} \cdots w_l$

(c) j and k are not bound.

According to this invariant, for a node $\eta_s$ which is the root of an initial tree, the item
$\langle t[\eta_s] \rightarrow \Gamma\, \bullet,\; 0, -, -, n \rangle$ is in the chart only if $t[\eta_s] \Rightarrow \overline{\Gamma} \Rightarrow^* w_1 \cdots w_n$. Thus, soundness
of the algorithm as a recognizer follows.

17 The invariant is not stated as a biconditional because this would require strengthening of the
antecedent condition. The natural strengthening, following the standard for Earley's algorithm, would
be to add a requirement that the item be consistent with left context, as

(d) $t[\eta_s] \Rightarrow^* w_1 \cdots w_i\, P[\gamma\eta]$

but this is too strong. This condition implies that the algorithm possesses the valid prefix property,
which it does not. The exact statement of the invariant condition that would allow for exact
specifications of the item semantics is the topic of ongoing research. However, the current specification is
sufficient for proving soundness of the algorithm.



6.3 Modifications for Extended Derivations 

Extending the algorithm to allow for the new types of production (specifically, as derived 
by Rule 4b) requires adding a completion rule for Type 4b productions. For the new 
type of production, a completion rule of the following form is required: 

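• Type 4b Completor:

$$\frac{\langle b[\eta] \rightarrow \bullet\, t[\eta_r],\; i, -, -, i \rangle \quad \langle t[\eta_r] \rightarrow \Theta\, \bullet,\; i, j, k, l \rangle \quad \langle b[\eta] \rightarrow \Delta\, \bullet,\; j, p, q, k \rangle}{\langle b[\eta] \rightarrow t[\eta_r]\, \bullet,\; i, p, q, l \rangle} \quad \overline{b[..\eta] \rightarrow t[..\eta\,\eta_r]}$$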



In addition to being able to complete Type 4b items, we must also be able to complete 
other items using completed Type 4b items. This is an issue in particular for completor 
rules that might move their dot over a $b[\eta]$ constituent, namely the Type 3 and 
5 Completors. However, these rules have been stated so that the antecedent item with 
right-hand side $b[\eta]$ already matches Type 4b items. Furthermore, the general statement, 
including index manipulation, is still appropriate in the context of Type 4b productions. 
Thus, no further changes to the recognition inference rules are needed for this purpose. 

However, a bit of care must be taken in the interpretation of the Type 1/2 Completor. 
Type 4b items that require completion bear a superficial resemblance to Type 1 and 2 
items, in that both have a constituent of the form $t[\_]$ after the dot. In Type 4b items, the 
constituent is $t[\eta_r]$, in Type 4a items $t[\eta]$. But it is crucial that the Type 1/2 Completor 
not be used to complete Type 4b items. A simple distinguishing characteristic is that 
in Type 1 and 2 items to be completed, the node $\eta$ after the dot is never a root node 
(as it is immediately dominated by $\eta_1$), whereas in Type 4b items, the node $\eta_r$ after the 
dot is always a root node (of a modifier tree). Simple side conditions can distinguish 
the cases. 

Figure 9 contains the final versions of the inference rules for recognition of LIGs 
corresponding to extended TAG derivations. 



6.4 Maintaining Derivation Structures 

One of the intended applications for extended derivation TAG parsing is the parsing of 
synchronous TAGs. Especially important in this application is the ability to generate 
the derivation trees while parsing proceeds. 

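A synchronous TAG is composed of two base TAGs (which we will call the source 
and target TAG) whose elementary trees have been paired one-to-one. A synchronous 
TAG whose source TAG is a grammar for a fragment of English, and whose target TAG 
is a grammar for a logical form language may be used to generate logical forms for 
each sentence of English that the source grammar admits (Shieber and Schabes, 1990). 
Similarly, with source and target swapped, the synchronized grammar may be used to 
generate English sentences corresponding to logical forms (Shieber and Schabes, 1991). 
If the source and target grammars specify fragments of natural languages, an automatic 
translation system is specified (Abeillé, Schabes, and Joshi, 1990). 

Scanner: 

$$\frac{\langle b[\eta] \to \Gamma \bullet a\,\Delta,\ i, j, k, l\rangle}{\langle b[\eta] \to \Gamma a \bullet \Delta,\ i, j, k, l+1\rangle} \quad a = w_{l+1}$$

Predictor: 

$$\frac{\langle P[\eta] \to \Gamma \bullet P'[\eta']\,\Delta,\ i, j, k, l\rangle}{\langle P'[\eta'] \to \bullet\,\Theta,\ l, -, -, l\rangle} \quad P'[\eta'] \to \Theta$$

Type 1 and 2 Completor: 

$$\frac{\langle b[\eta_1] \to \Gamma \bullet t[\eta]\,\Delta,\ m, j', k', i\rangle \quad \langle t[\eta] \to \Theta\,\bullet,\ i, j, k, l\rangle}{\langle b[\eta_1] \to \Gamma\, t[\eta] \bullet \Delta,\ m, j \cup j', k \cup k', l\rangle} \quad \eta \text{ not a root node}$$

Type 3 Completor: 

$$\frac{\langle t[\eta] \to \bullet\, b[\eta],\ i, -, -, i\rangle \quad \langle b[\eta] \to \Theta\,\bullet,\ i, j, k, l\rangle}{\langle t[\eta] \to b[\eta]\,\bullet,\ i, j, k, l\rangle}$$

Type 4a Completor: 

$$\frac{\langle t[\eta] \to \bullet\, t[\eta_r],\ i, -, -, i\rangle \quad \langle t[\eta_r] \to \Theta\,\bullet,\ i, j, k, l\rangle \quad \langle b[\eta] \to \Delta\,\bullet,\ j, p, q, k\rangle}{\langle t[\eta] \to t[\eta_r]\,\bullet,\ i, p, q, l\rangle}$$

Type 4b Completor: 

$$\frac{\langle b[\eta] \to \bullet\, t[\eta_r],\ i, -, -, i\rangle \quad \langle t[\eta_r] \to \Theta\,\bullet,\ i, j, k, l\rangle \quad \langle b[\eta] \to \Delta\,\bullet,\ j, p, q, k\rangle}{\langle b[\eta] \to t[\eta_r]\,\bullet,\ i, p, q, l\rangle}$$

Type 5 Completor: 

$$\frac{\langle b[\eta_f] \to \bullet\, b[\eta],\ i, -, -, i\rangle \quad \langle b[\eta] \to \Theta\,\bullet,\ i, j, k, l\rangle}{\langle b[\eta_f] \to b[\eta]\,\bullet,\ i, i, l, l\rangle}$$

Type 6 Completor: 

$$\frac{\langle t[\eta] \to \bullet\, t[\eta_r],\ i, -, -, i\rangle \quad \langle t[\eta_r] \to \Theta\,\bullet,\ i, -, -, l\rangle}{\langle t[\eta] \to t[\eta_r]\,\bullet,\ i, -, -, l\rangle}$$

Figure 9: Inference Rules for Extended Derivation TAG Recognition 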

Abstractly viewed, the processing of a synchronous grammar proceeds by parsing an 
input string according to the source grammar, thereby generating a derivation tree for 
the string; mapping the derivation tree into a derivation tree for the target grammar; 
and generating a derived tree (hence, derived string) according to the target grammar. 

One frequent worry about synchronous TAGs as used in their semantic interpreta- 
tion mode is whether it is possible to perform incremental interpretation. The abstract 
view of processing just presented seems to require that a full derivation tree be de- 
veloped before interpretation into the logical form language can proceed. Incremental 
interpretation, on the other hand, would allow partial interpretation results to guide the 
parsing process on-line, thereby decreasing the nondeterminism in the parsing process. 
Whether incremental interpretation is possible depends precisely on the extent to which 
the three abstract phases of synchronous TAG processing can in fact be interleaved. 
In previous work, we left this issue open. In this section, we allay these worries by 
showing how the extended TAG parser just presented can build derivation trees incre- 
mentally as parsing proceeds. Once this has been demonstrated, it should be obvious 
that these derivation trees could be transferred to target derivation trees during the 
parsing process, and immediately generated from. Thus, incremental interpretation is 
demonstrated to be possible in the synchronous TAG framework. In fact, the technique 
presented in this section has allowed for the first implementation of synchronous TAG 
processing, due to Onnig Dombalagian. This implementation was directly based on the 
inference-based TAG parser mentioned in Section 3.E and presented in full elsewhere 
(Schabes and Shieber, 1992). 

We associate with each item a set of operations that have been implicitly carried out 
by the parser in recognizing the substring covered by the item. An operation can be 
characterized by a derivation tree and a tree address at which the derivation tree is to 
be placed; it corresponds roughly to a branch of a derivation tree. Prediction items have 
the empty set of operations. Type 4 and 6 completion steps build new elements of the 
sets as they correspond to actually carrying out adjunction and substitution operations, 
respectively. Other completion steps merely pool the operations from their constituent 
parts. 

In describing the building of derivation trees, we will use normal set notation for 
the sets of derivation trees. We will assume that for each node $\eta$, there are functions 
$\mathit{tree}(\eta)$ and $\mathit{addr}(\eta)$ that specify, respectively, the initial tree that $\eta$ occurs in and its 
address in that tree. Finally, we will use a constructor function for derivation trees 
$\mathit{deriv}(\gamma, S)$, where $\gamma$ specifies an elementary tree and $S$ specifies a set of operations on 
it. An operation is built with $\mathit{op}(t, D)$ where $t$ is a tree address and $D$ is a derivation 
tree to be operated at that address. 
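The constructors just introduced can be pictured as two small immutable record types. The following is only a sketch in Python (the implementation mentioned in the text is not shown in this paper); the class names `Deriv` and `Op`, the helper `complete_adjunction`, and the tree names `alpha_runs` and `beta_quickly` are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Address = Tuple[int, ...]  # a tree address, e.g. (2, 1)


@dataclass(frozen=True)
class Op:
    """op(t, D): derivation tree D is to be operated at tree address t."""
    addr: Address
    deriv: "Deriv"


@dataclass(frozen=True)
class Deriv:
    """deriv(gamma, S): elementary tree gamma with a set S of operations on it."""
    tree: str
    ops: FrozenSet[Op] = frozenset()


def complete_adjunction(addr_eta, tree_eta_r, s1, s2):
    """Operation set of the shape {op(addr(eta), deriv(tree(eta_r), S1))} U S2."""
    new_op = Op(addr_eta, Deriv(tree_eta_r, frozenset(s1)))
    return frozenset({new_op}) | frozenset(s2)


# Hypothetical example: an auxiliary tree adjoined at address 2 of an initial tree.
ops = complete_adjunction((2,), "beta_quickly", set(), set())
derivation = Deriv("alpha_runs", ops)
```

Because both records are frozen, sets of operations can be stored directly inside chart items and compared for equality.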

Figure 10 lists the previously presented recognition rules augmented to build derivation 
structures as the final component of each item. The axioms for this inference system 
are items of the form $\langle S \to \bullet\, t[\eta_s],\ 0, -, -, 0, \{\}\rangle$, where we assume, as before, that 
there are extra rules $S \to t[\eta_s]$ for each $\eta_s$ a start-nonterminal-labeled root node of an 
initial tree. We require an extra rule for prediction and completion to handle this new 
type of rule. The predictor rule is the obvious analog: 

• Start Rule Predictor: 

$$\frac{\langle S \to \Gamma \bullet P'[\eta']\,\Delta,\ i, j, k, l, S\rangle}{\langle P'[\eta'] \to \bullet\,\Theta,\ l, -, -, l, \{\}\rangle} \quad P'[\eta'] \to \Theta$$

In fact, the existing predictor rule could have been easily generalized to handle this case. 

The completor for these start rules is the obvious analog to a Type 6 completor, 
except in the handling of the derivation. It delivers, instead of a set of derivation 
operations, a single derivation tree. 

• Start Rule Completor: 

$$\frac{\langle S \to \bullet\, t[\eta_s],\ i, -, -, i, \{\}\rangle \quad \langle t[\eta_s] \to \Theta\,\bullet,\ i, -, -, l, S\rangle}{\langle S \to t[\eta_s]\,\bullet,\ i, -, -, l, \mathit{deriv}(\mathit{tree}(\eta_s), S)\rangle}$$

The string is accepted upon proving $\langle S \to t[\eta_s]\,\bullet,\ 0, -, -, n, D\rangle$, where $D$ is the 
derivation developed during the parse. 

6.5 Complexity Considerations 



The inference system of Section 6.3 essentially specifies a parsing algorithm with 
complexity of $O(n^6)$ in the length of the string. Adding explicit derivation structures to 
the items, as in the inference system of the previous section, eliminates the polynomial 
character of the algorithm, in that there may be an unbounded number of derivations 
corresponding to any given item of the original sort. Even for finitely ambiguous 
grammars, the number of derivations may be exponential. Nonetheless, this fact does not 
vitiate the usefulness of the second algorithm, which maintains derivations explicitly. 
The point of this augmentation is to allow for incremental interpretation — for inter- 
leaved processing of a post-syntactic sort — so as to guide the parsing process in making 
choices on-line. By using the extra derivation information, the parser should be able to 
eliminate certain nondeterministic paths of computation; otherwise, there is no reason 
to do the interpretation incrementally. But this determinization of choice presumably 
decreases the complexity. Thus, the extra information is designed for use in cases where 
the full search space is not intended to be explored. 

Of course, a polynomial shared-forest representation of the exponential number of 
derivations could have been maintained (by maintaining back pointers among the items 
in the standard fashion). For performing incremental interpretation for the purpose 
of determinization of parsing, however, the non-shared representation is sufficient, and 
preferable on grounds of ease of implementation and expository convenience. 

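Scanner: 

$$\frac{\langle b[\eta] \to \Gamma \bullet a\,\Delta,\ i, j, k, l, S\rangle}{\langle b[\eta] \to \Gamma a \bullet \Delta,\ i, j, k, l+1, S\rangle} \quad a = w_{l+1}$$

Predictor: 

$$\frac{\langle P[\eta] \to \Gamma \bullet P'[\eta']\,\Delta,\ i, j, k, l, S\rangle}{\langle P'[\eta'] \to \bullet\,\Theta,\ l, -, -, l, \{\}\rangle} \quad P'[\eta'] \to \Theta$$

Type 1 and 2 Completor: 

$$\frac{\langle b[\eta_1] \to \Gamma \bullet t[\eta]\,\Delta,\ m, j', k', i, S_1\rangle \quad \langle t[\eta] \to \Theta\,\bullet,\ i, j, k, l, S_2\rangle}{\langle b[\eta_1] \to \Gamma\, t[\eta] \bullet \Delta,\ m, j \cup j', k \cup k', l, S_1 \cup S_2\rangle} \quad \eta \text{ not a root node}$$

Type 3 Completor: 

$$\frac{\langle t[\eta] \to \bullet\, b[\eta],\ i, -, -, i, \{\}\rangle \quad \langle b[\eta] \to \Theta\,\bullet,\ i, j, k, l, S\rangle}{\langle t[\eta] \to b[\eta]\,\bullet,\ i, j, k, l, S\rangle}$$

Type 4a Completor: 

$$\frac{\langle t[\eta] \to \bullet\, t[\eta_r],\ i, -, -, i, \{\}\rangle \quad \langle t[\eta_r] \to \Theta\,\bullet,\ i, j, k, l, S_1\rangle \quad \langle b[\eta] \to \Delta\,\bullet,\ j, p, q, k, S_2\rangle}{\langle t[\eta] \to t[\eta_r]\,\bullet,\ i, p, q, l, \{\mathit{op}(\mathit{addr}(\eta), \mathit{deriv}(\mathit{tree}(\eta_r), S_1))\} \cup S_2\rangle}$$

Type 4b Completor: 

$$\frac{\langle b[\eta] \to \bullet\, t[\eta_r],\ i, -, -, i, \{\}\rangle \quad \langle t[\eta_r] \to \Theta\,\bullet,\ i, j, k, l, S_1\rangle \quad \langle b[\eta] \to \Delta\,\bullet,\ j, p, q, k, S_2\rangle}{\langle b[\eta] \to t[\eta_r]\,\bullet,\ i, p, q, l, \{\mathit{op}(\mathit{addr}(\eta), \mathit{deriv}(\mathit{tree}(\eta_r), S_1))\} \cup S_2\rangle}$$

Type 5 Completor: 

$$\frac{\langle b[\eta_f] \to \bullet\, b[\eta],\ i, -, -, i, \{\}\rangle \quad \langle b[\eta] \to \Theta\,\bullet,\ i, j, k, l, S\rangle}{\langle b[\eta_f] \to b[\eta]\,\bullet,\ i, i, l, l, \{\}\rangle}$$

Type 6 Completor: 

$$\frac{\langle t[\eta] \to \bullet\, t[\eta_r],\ i, -, -, i, \{\}\rangle \quad \langle t[\eta_r] \to \Theta\,\bullet,\ i, -, -, l, S\rangle}{\langle t[\eta] \to t[\eta_r]\,\bullet,\ i, -, -, l, \{\mathit{op}(\mathit{addr}(\eta), \mathit{deriv}(\mathit{tree}(\eta_r), S))\}\rangle}$$

Figure 10: Inference Rules for Extended Derivation TAG Parsing 

As a proof of concept, the parsing algorithm just described was implemented in Prolog 
on top of a simple, general-purpose, agenda-based inference engine. Encodings of 
explicit inference rules are essentially interpreted by the inference engine. The Prolog 
database is used as the chart; items not already subsumed by a previously generated 
item are asserted to the database as the parser runs. An agenda is maintained of potential 
new items. Items are added to the agenda as inference rules are triggered by items 
added to the chart. Because the inference rules are stated explicitly, the relation between 
the abstract inference rules described in this paper and the implementation is extremely 
transparent. As a meta-interpreter, the prototype is not particularly efficient. (In 
particular, the implementation does not achieve the theoretical $O(n^6)$ bound on complexity, 
because of a lack of appropriate indexing.) Code for the prototype implementation is 
available for distribution electronically from the authors. 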
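The agenda-driven control regime described for the prototype can be sketched generically. The code below is not the Prolog implementation referred to above. It is a minimal Python illustration in which plain item equality stands in for the subsumption check, and a toy transitivity rule stands in for the TAG inference rules. The names `run_agenda` and `transitivity` are hypothetical.

```python
from collections import deque


def run_agenda(axioms, rules):
    """Exhaustively apply inference rules; returns the chart (a set of items).

    Each rule is a function rule(trigger, chart) -> iterable of new items,
    where `trigger` is the item that has just been added to the chart.
    """
    chart = set()
    agenda = deque(axioms)
    while agenda:
        item = agenda.popleft()
        if item in chart:            # equality stands in for subsumption here
            continue
        chart.add(item)
        for rule in rules:
            for consequent in rule(item, chart):
                if consequent not in chart:
                    agenda.append(consequent)
    return chart


def transitivity(trigger, chart):
    """Toy rule: from path(a, b) and path(b, c) infer path(a, c)."""
    a, b = trigger
    for (c, d) in list(chart):
        if b == c:                   # trigger is the left antecedent
            yield (a, d)
        if d == a:                   # trigger is the right antecedent
            yield (c, b)


closure = run_agenda([(0, 1), (1, 2), (2, 3)], [transitivity])
```

Since every rule is triggered by the item just moved to the chart, each pair of compatible antecedents is eventually combined no matter which of the two is derived first.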

7 Conclusion 

The precise formulation of derivation for tree-adjoining grammars has important rami- 
fications for a wide variety of uses of the formalism, from syntactic analysis to semantic 
interpretation and statistical language modeling. We have argued that the definition of 
tree-adjoining derivation must be reformulated in order to take greatest advantage of 
the decoupling of derivation tree and derived tree by manifesting the proper linguistic 
dependencies in derivations. The particular proposal is both precisely characterizable 
through a definition of TAG derivations as equivalence classes of ordered derivation 
trees, and computationally operational, by virtue of a compilation to linear indexed 
grammars together with an efficient algorithm for recognition and parsing according to 
the compiled grammar. 
A Proof of Redundancy of Adjacent Sibling Swapping 

A.1 Preliminaries 
A.1.1 Tree Addresses 

We define tree addresses (variables over which are conventionally notated $p, q, \ldots, t, u, v$ 
and their subscripted and primed variants) as the finite, possibly empty, sequences of 
positive integers (conventionally $i, j, k$), with $\_ \cdot \_$ as the sequence concatenation operator. 
We uniformly abuse notation by conflating the distinction between singleton sequences 
and their one element. 

We use the notation $p \prec q$ to notate that tree address $p$ is a proper prefix of $q$, and 
$p \preceq q$ for improper prefix. When $p \preceq q$, we write $q - p$ for the (possibly empty) sequence 
obtained from $q$ by removing $p$ from the front, e.g. $1 \cdot 2 \cdot 3 \cdot 4 - 1 \cdot 2 = 3 \cdot 4$. 
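As an informal illustration of these conventions, tree addresses can be modeled as Python tuples of positive integers, with the empty tuple as the empty address. The function names below are hypothetical and are not part of the paper's notation.

```python
def is_prefix(p, q):
    """Improper prefix: True when q begins with p (p may equal q)."""
    return q[:len(p)] == p


def is_proper_prefix(p, q):
    """Proper prefix: p is a prefix of q and strictly shorter than q."""
    return len(p) < len(q) and is_prefix(p, q)


def subtract(q, p):
    """q - p: the sequence left after removing the prefix p from the front of q."""
    if not is_prefix(p, q):
        raise ValueError("p must be a prefix of q")
    return q[len(p):]


concatenated = (1, 2) + (3, 4)              # sequence concatenation
remainder = subtract((1, 2, 3, 4), (1, 2))  # the example from the text
```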

A.1.2 Trees 

We will take trees (conventionally $A, B, E, T$; also $\alpha, \beta, \gamma$ in the prior text) to be finite 
partial functions from tree addresses to symbols, such that the functions are 

Prefix closed: For any tree $T$, if $T(p \cdot i)$ is defined then $T(p)$ is defined. 

Left closed: For any tree $T$, if $T(p \cdot i)$ is defined and $i > 1$ then $T(p \cdot (i - 1))$ is defined. 

We will refer to the domain of a tree $T$, the tree addresses for which $T$ is defined, as 
the nodes of $T$. A node $p$ of $T$ is a frontier node if $T(p \cdot i)$ is undefined for all $i$. A node 
of $T$ is an interior node if it is not a frontier node. We say that a node $p$ of $T$ is labeled 
with a symbol $s$ if $T(p) = s$. 
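A finite partial function from addresses to symbols is conveniently modeled as a dictionary. The sketch below, with hypothetical names `is_tree` and `frontier` and an invented example tree, checks the two closure conditions and computes the frontier nodes. It relies on left closure to test only the first child when deciding whether a node is on the frontier.

```python
def is_tree(T):
    """Check that the dict T (address tuple -> symbol) is prefix closed and left closed."""
    for p in T:
        if p:                                   # every non-root address
            parent, i = p[:-1], p[-1]
            if parent not in T:                 # prefix closure fails
                return False
            if i > 1 and parent + (i - 1,) not in T:   # left closure fails
                return False
    return True


def frontier(T):
    """Frontier nodes of a well-formed tree: those with no first child."""
    return {p for p in T if p + (1,) not in T}


# Hypothetical example: S with children NP and VP, where VP has one child V.
T = {(): "S", (1,): "NP", (2,): "VP", (2, 1): "V"}
well_formed = is_tree(T)
leaves = frontier(T)
```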

A.2 Tree-Adjoining Grammars and Derivations 
A.2.1 Tree-Adjoining Grammars 

In the following definitions, we restrict attention to tree-adjoining grammars in which 
adjunction is the only operation; substitution is not allowed. The definitions are, however, 
easily augmented to include substitution. We define a tree-adjoining grammar to 
be given by a quintuple $(\Sigma, N, \mathcal{I}, \mathcal{A}, S)$ where 

• $\Sigma$ is a finite set of terminal symbols. 

• $N$ is a finite set of nonterminal symbols disjoint from $\Sigma$. 

• ($V = \Sigma \cup N$ is the vocabulary of the grammar.) 

• $S$ is a distinguished nonterminal symbol, the start symbol. 

• $\mathcal{I}$ is a finite set of trees, the initial trees, where 

— interior nodes are labeled by nonterminal symbols, and 

— frontier nodes are labeled by terminal symbols or the special symbol $\epsilon$. (We 
require that $\epsilon \notin V$, as $\epsilon$ intuitively specifies the empty string.) 

• $\mathcal{A}$ is a finite set of trees, the auxiliary trees, where 

— interior nodes are labeled by nonterminal symbols, and 

— frontier nodes are labeled by terminal symbols or $\epsilon$, except for one node, 
called the foot node, which is labeled with a nonterminal symbol. 

• ($\mathcal{E} = \mathcal{I} \cup \mathcal{A}$ is the set of elementary trees of the grammar.) 

By convention, the address of the foot node of a tree $A$ is notated $f_A$. 

A.2.2 Adjunction 

The adjunction of an auxiliary tree $A$ at address $t$ in tree $E$, notated $E[A/t]$, is defined 
to be the smallest (least defined) tree $T$ such that 

$$T(r) = \begin{cases} E(r) & \text{if } t \not\prec r & (1) \\ A(u) & \text{if } r = t \cdot u \text{ and } f_A \not\prec u & (2) \\ E(t \cdot u) & \text{if } r = t \cdot f_A \cdot u & (3) \end{cases}$$

These cases are disjoint except at addresses $t$ and $t \cdot f_A$. We have 

$$T(t) = E(t)$$

by clause (1), and 

$$T(t) = A(\epsilon)$$

by clause (2). Similarly, we have 

$$T(t \cdot f_A) = A(f_A)$$

by clause (2) and 

$$T(t \cdot f_A) = E(t)$$

by clause (3). So for an adjunction to be well defined, it must be the case that 

$$E(t) = A(\epsilon) = A(f_A)$$

that is, the node at which adjunction occurs must have the same label as the root and 
foot of the auxiliary tree adjoined. This is, of course, standard in definitions of TAG. 

Alternatively, this constraint can be added as a stipulation and the definition modified 
as follows: 

$$T(r) = \begin{cases} E(r) & \text{if } t \not\preceq r \\ A(u) & \text{if } r = t \cdot u \text{ and } f_A \not\preceq u \\ E(t \cdot u) & \text{if } r = t \cdot f_A \cdot u \end{cases}$$

We will use this latter definition below. 
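The latter, three-clause definition translates almost directly into code when trees are modeled as dictionaries from address tuples to labels. The following Python sketch is an illustration under that assumption. The function name `adjoin` and the two toy trees are hypothetical, and the label constraint is checked with an assertion rather than stipulated.

```python
def adjoin(E, A, foot, t):
    """Return E[A/t], where `foot` is the address of the foot node of A."""
    assert E[t] == A[()] == A[foot], "site, root and foot labels must agree"
    T = {}
    for r, sym in E.items():
        if r[:len(t)] == t:
            # r = t.u: the subtree of E rooted at t moves below the foot
            T[t + foot + r[len(t):]] = sym
        else:
            # t is not a prefix of r: the node is unaffected
            T[r] = sym
    for u, sym in A.items():
        if u != foot:
            # every node of A except the foot, whose place is taken by E(t)
            T[t + u] = sym
    return T


# Hypothetical toy trees: E is S over "a"; A is S over "b" and a foot node S.
E = {(): "S", (1,): "a"}
A = {(): "S", (1,): "b", (2,): "S"}
T = adjoin(E, A, foot=(2,), t=())
```

Because the foot node is a frontier node of the auxiliary tree, excluding the single address `foot` is enough to respect the side condition on the second clause.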
A.2.3 Ordered Derivation Trees 

Ordered derivation trees are ordered trees composed of nodes, conventionally notated as 
$\eta$, possibly in its subscripted and primed variants. (For ordered derivation trees, we will 
be less formal as to their mathematical structure. In particular, the formalization of the 
previous section need not apply; the definitions that follow define all of the structure 
that we will need.) The parent of a node $\eta$ in a derivation tree will be written $\mathit{parent}(\eta)$, 
and the tree in $\mathcal{E}$ that the node marks adjunction of will be notated $\mathit{tree}(\eta)$. The tree 
$\mathit{tree}(\eta)$ is to be adjoined into its parent tree $\mathit{tree}(\mathit{parent}(\eta))$ at an address specified on the 
arc in the tree linking the two; this address is notated $\mathit{addr}(\eta)$. (Of course, the root node 
has no parent or address; the $\mathit{parent}$ and $\mathit{addr}$ functions are partial.) 

An ordered derivation tree is well-formed if for each arc in the derivation tree from 
$\eta$ to $\mathit{parent}(\eta)$ labeled with $\mathit{addr}(\eta)$, the tree $\mathit{tree}(\eta)$ is an auxiliary tree that can be 
adjoined at the node $\mathit{addr}(\eta)$ in $\mathit{tree}(\mathit{parent}(\eta))$. 



We repeat from Section 4.1 the definition of the function $\mathcal{D}$ from derivation trees to 
the derived trees they specify, in the notation of this appendix: 

$$\mathcal{D}(D) = \begin{cases} \mathit{tree}(\eta) & \text{if } D \text{ is a trivial tree of one node } \eta \\ \mathit{tree}(\eta)[\mathcal{D}(D_1)/t_1, \mathcal{D}(D_2)/t_2, \ldots, \mathcal{D}(D_k)/t_k] & \text{if } D \text{ is a tree with root node } \eta \text{ and with } k \text{ child subtrees } D_1, \ldots, D_k \\ & \text{whose arcs are labeled with addresses } t_1, \ldots, t_k. \end{cases}$$

As in Section 4.1, $E[A_1/t_1, \ldots, A_k/t_k]$ specifies the simultaneous adjunction of trees 
$A_1$ through $A_k$ at $t_1$ through $t_k$, respectively, in $E$. It is defined as the iterative 
adjunction of the $A_i$ in order at their respective addresses, with appropriate updating of the 
tree addresses of later adjunctions to reflect the effect of earlier adjunctions. In 
particular, the following inductive definition suffices; the base case holds for the adjunction of 
zero auxiliary trees. 

$$E[\,] = E$$

$$E[A_1/t_1, A_2/t_2, \ldots, A_k/t_k] = (E[A_1/t_1])[A_2/\mathit{update}(t_2, A_1, t_1), \ldots, A_k/\mathit{update}(t_k, A_1, t_1)]$$

where 

$$\mathit{update}(s, A, t) = \begin{cases} s & \text{if } t \not\prec s \\ t \cdot f_A \cdot (s - t) & \text{if } t \prec s \end{cases}$$

In the following section, we leave out parentheses in specifying sequential adjunctions 
such as $(E[A_1/t_1])[A_2/t_2]$ under a convention of left associativity of the $[\_/\_]$ operator. 
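The inductive definition of simultaneous adjunction can be exercised on a small example. In this Python sketch, trees are again dictionaries from address tuples to labels and an auxiliary tree is passed as a pair of its dictionary and its foot address. The function names and the toy trees are hypothetical, and `adjoin` follows the three-clause definition of adjunction given earlier in this appendix.

```python
def is_proper_prefix(p, q):
    """True when address p is a proper prefix of address q."""
    return len(p) < len(q) and q[:len(p)] == p


def update(s, foot, t):
    """Address s after adjoining, at t, an auxiliary tree whose foot is at `foot`."""
    if is_proper_prefix(t, s):
        return t + foot + s[len(t):]      # t . f_A . (s - t)
    return s


def adjoin(E, aux, t):
    """Single adjunction E[A/t]; aux is a pair (A, foot address of A)."""
    A, foot = aux
    assert E[t] == A[()] == A[foot], "site, root and foot labels must agree"
    T = {}
    for r, sym in E.items():
        if r[:len(t)] == t:
            T[t + foot + r[len(t):]] = sym    # subtree at t moves below the foot
        else:
            T[r] = sym                        # nodes not under t are unaffected
    for u, sym in A.items():
        if u != foot:
            T[t + u] = sym                    # foot position is filled by E(t)
    return T


def adjoin_all(E, adjunctions):
    """E[A_1/t_1, ..., A_k/t_k]: adjoin in order, updating later addresses."""
    pending = list(adjunctions)
    while pending:
        (aux, t), rest = pending[0], pending[1:]
        E = adjoin(E, aux, t)
        pending = [(a, update(s, aux[1], t)) for a, s in rest]
    return E


# Hypothetical toy trees.
E = {(): "S", (1,): "X", (1, 1): "a"}
A = ({(): "S", (1,): "b", (2,): "S"}, (2,))   # foot at address 2
B = ({(): "X", (1,): "X", (2,): "c"}, (1,))   # foot at address 1
derived = adjoin_all(E, [(A, ()), (B, (1,))])
```

Here the adjunction of A at the root pushes the node originally at address 1 down to address 2·1, which is exactly the address that `update` supplies for the later adjunction of B.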

A.3 Effect of Sibling Swaps 

In this section, we show that the derived tree specified by a given ordered derivation tree 
is unchanged if adjacent siblings whose arcs are labeled with different tree addresses are 
swapped. This will be shown as the following proposition. 

Proposition 1 If $t \neq t'$ then $E[\ldots, A/t, B/t', \ldots] = E[\ldots, B/t', A/t, \ldots]$. 
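We start with a lemma, the case for only two adjunctions. 

Lemma 1 If $t \neq t'$ then $E[A/t, B/t'] = E[B/t', A/t]$. 

Proof: There are three major cases, depending on the relationship of $t$ and $t'$: 

Case $t \prec t'$: Let $s = t' - t$. Then 

$$\begin{aligned}
E[A/t, B/t'](r) &= E[A/t][B/\mathit{update}(t', A, t)](r) \\
&= E[A/t][B/t \cdot f_A \cdot s](r) \\
&= \begin{cases}
E[A/t](r) & \text{if } t \cdot f_A \cdot s \not\preceq r \\
B(u) & \text{if } r = t \cdot f_A \cdot s \cdot u \text{ and } f_B \not\preceq u \\
E[A/t](t \cdot f_A \cdot s \cdot u) & \text{if } r = t \cdot f_A \cdot s \cdot f_B \cdot u
\end{cases} \\
&= \begin{cases}
E(r) & \text{if } t \cdot f_A \cdot s \not\preceq r \text{ and } t \not\preceq r \\
A(v) & \text{if } t \cdot f_A \cdot s \not\preceq r \text{ and } r = t \cdot v \\
E(t \cdot v) & \text{if } t \cdot f_A \cdot s \not\preceq r \text{ and } r = t \cdot f_A \cdot v \\
B(u) & \text{if } r = t \cdot f_A \cdot s \cdot u \text{ and } f_B \not\preceq u \\
E(t \cdot s \cdot u) & \text{if } r = t \cdot f_A \cdot s \cdot f_B \cdot u
\end{cases} \\
&= \begin{cases}
E(r) & \text{if } t \not\preceq r \\
A(v) & \text{if } r = t \cdot v \\
E(t \cdot v) & \text{if } s \not\preceq v \text{ and } r = t \cdot f_A \cdot v \\
B(u) & \text{if } r = t \cdot f_A \cdot s \cdot u \text{ and } f_B \not\preceq u \\
E(t \cdot s \cdot u) & \text{if } r = t \cdot f_A \cdot s \cdot f_B \cdot u
\end{cases}
\end{aligned}$$

If siblings are swapped, 

$$\begin{aligned}
E[B/t', A/t](r) &= E[B/t'][A/\mathit{update}(t, B, t')](r) \\
&= E[B/t'][A/t](r) \\
&= \begin{cases}
E[B/t \cdot s](r) & \text{if } t \not\preceq r \\
A(v) & \text{if } r = t \cdot v \text{ and } f_A \not\preceq v \\
E[B/t \cdot s](t \cdot v) & \text{if } r = t \cdot f_A \cdot v
\end{cases} \\
&= \begin{cases}
E(r) & \text{if } t \not\preceq r \\
A(v) & \text{if } r = t \cdot v \\
E(t \cdot v) & \text{if } r = t \cdot f_A \cdot v \text{ and } t \cdot s \not\preceq t \cdot v \\
B(u) & \text{if } r = t \cdot f_A \cdot v \text{ and } t \cdot v = t \cdot s \cdot u \text{ and } f_B \not\preceq u \\
E(t \cdot s \cdot u) & \text{if } r = t \cdot f_A \cdot v \text{ and } t \cdot v = t \cdot s \cdot f_B \cdot u
\end{cases} \\
&= \begin{cases}
E(r) & \text{if } t \not\preceq r \\
A(v) & \text{if } r = t \cdot v \\
E(t \cdot v) & \text{if } s \not\preceq v \text{ and } r = t \cdot f_A \cdot v \\
B(u) & \text{if } r = t \cdot f_A \cdot s \cdot u \text{ and } f_B \not\preceq u \\
E(t \cdot s \cdot u) & \text{if } r = t \cdot f_A \cdot s \cdot f_B \cdot u
\end{cases}
\end{aligned}$$

Case $t' \prec t$: Analogously. 

Case $t \not\prec t'$ and $t' \not\prec t$: 

$$\begin{aligned}
E[A/t, B/t'](r) &= E[A/t][B/\mathit{update}(t', A, t)](r) \\
&= E[A/t][B/t'](r) \\
&= \begin{cases}
E[A/t](r) & \text{if } t' \not\preceq r \\
B(u) & \text{if } r = t' \cdot u \text{ and } f_B \not\preceq u \\
E[A/t](t' \cdot u) & \text{if } r = t' \cdot f_B \cdot u
\end{cases} \\
&= \begin{cases}
E(r) & \text{if } t' \not\preceq r \text{ and } t \not\preceq r \\
A(v) & \text{if } t' \not\preceq r \text{ and } r = t \cdot v \text{ and } f_A \not\preceq v \\
E(t \cdot v) & \text{if } t' \not\preceq r \text{ and } r = t \cdot f_A \cdot v \\
B(u) & \text{if } r = t' \cdot u \text{ and } f_B \not\preceq u \\
E(t' \cdot u) & \text{if } r = t' \cdot f_B \cdot u
\end{cases}
\end{aligned}$$

Note that this is unchanged (up to variable renaming) under swapping of $A$ for $B$ 
and $t$ for $t'$. That is, $E[A/t, B/t'](r) = E[B/t', A/t](r)$. □ 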
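The lemma can also be checked mechanically on small instances. The Python sketch below is an illustration, not part of the proof. It models trees as dictionaries from address tuples to labels, labels every node of a hypothetical host tree with the same nonterminal so that any adjunction is well formed (frontier nodes included, which departs from the grammar definition but not from the definition of adjunction), and compares both adjunction orders for every pair of distinct addresses.

```python
from itertools import permutations


def update(s, foot, t):
    """Address s after adjoining, at t, a tree whose foot address is `foot`."""
    return t + foot + s[len(t):] if len(t) < len(s) and s[:len(t)] == t else s


def adjoin(E, aux, t):
    """Single adjunction E[A/t]; aux is a pair (A, foot address of A)."""
    A, foot = aux
    T = {}
    for r, sym in E.items():
        under_t = r[:len(t)] == t
        T[t + foot + r[len(t):] if under_t else r] = sym
    T.update({t + u: sym for u, sym in A.items() if u != foot})
    return T


def adjoin_seq(E, pairs):
    """Iterated adjunction with updating of the later addresses."""
    pairs = list(pairs)
    while pairs:
        (aux, t), rest = pairs[0], pairs[1:]
        E = adjoin(E, aux, t)
        pairs = [(a, update(s, aux[1], t)) for a, s in rest]
    return E


# Hypothetical host tree and auxiliary trees, all adjunction sites labeled X.
E = {(): "X", (1,): "X", (2,): "X", (1, 1): "X", (1, 2): "X"}
A = ({(): "X", (1,): "a", (2,): "X"}, (2,))
B = ({(): "X", (1,): "X", (2,): "b"}, (1,))

swaps_agree = all(
    adjoin_seq(E, [(A, t), (B, u)]) == adjoin_seq(E, [(B, u), (A, t)])
    for t, u in permutations(E, 2)      # all ordered pairs of distinct addresses
)
same_address_differs = (
    adjoin_seq(E, [(A, ()), (B, ())]) != adjoin_seq(E, [(B, ()), (A, ())])
)
```

The second check illustrates why the lemma requires distinct addresses: with two adjunctions at the same address, the order determines which auxiliary tree ends up on top.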

We now return to the main proposition. 

Proposition 2 If $t \neq t'$ then $E[\ldots, A/t, B/t', \ldots] = E[\ldots, B/t', A/t, \ldots]$. 

Proof: The effect of the adjunctions before the two specified in the swap is obviously 
the same on all following adjunctions, so we need only show that 

$$E[A/t, B/t', C_1/t_1, \ldots, C_k/t_k] = E[B/t', A/t, C_1/t_1, \ldots, C_k/t_k]$$

without loss of generality. We examine the effect of the $A$ and $B$ adjunctions on the tree 
address $t_i$ for each $C_i$ separately. In the case of the former adjunction order 

$$\begin{aligned}
E[A/t, B/t', \ldots, C_i/t_i, \ldots] &= E[A/t][B/\mathit{update}(t', A, t), \ldots, C_i/\mathit{update}(t_i, A, t), \ldots] \\
&= E[A/t][B/\mathit{update}(t', A, t)][\ldots, C_i/\mathit{update}(\mathit{update}(t_i, A, t), B, \mathit{update}(t', A, t)), \ldots] \\
&= E[A/t, B/t'][\ldots, C_i/\mathit{update}(\mathit{update}(t_i, A, t), B, \mathit{update}(t', A, t)), \ldots]
\end{aligned}$$

and for the latter adjunction order: 

$$\begin{aligned}
E[B/t', A/t, \ldots, C_i/t_i, \ldots] &= E[B/t'][A/\mathit{update}(t, B, t'), \ldots, C_i/\mathit{update}(t_i, B, t'), \ldots] \\
&= E[B/t'][A/\mathit{update}(t, B, t')][\ldots, C_i/\mathit{update}(\mathit{update}(t_i, B, t'), A, \mathit{update}(t, B, t')), \ldots] \\
&= E[B/t', A/t][\ldots, C_i/\mathit{update}(\mathit{update}(t_i, B, t'), A, \mathit{update}(t, B, t')), \ldots] \\
&= E[A/t, B/t'][\ldots, C_i/\mathit{update}(\mathit{update}(t_i, B, t'), A, \mathit{update}(t, B, t')), \ldots]
\end{aligned}$$

This last step holds by virtue of the lemma. 
Thus, it suffices to show that 

$$\mathit{update}(\mathit{update}(t_i, A, t), B, \mathit{update}(t', A, t)) = \mathit{update}(\mathit{update}(t_i, B, t'), A, \mathit{update}(t, B, t'))$$

Again, we perform a case analysis depending on the prefix relationships of $t$, $t'$, and 
$t_i$. Note that we make use of the fact that if $t \prec t'$ then $(t' - t) \cdot s = t' \cdot s - t$. 

Case $t \prec t'$: 

Subcase $t' \prec t_i$: 

$$\begin{aligned}
\mathit{update}(\mathit{update}(t_i, A, t), B, \mathit{update}(t', A, t)) &= \mathit{update}(t \cdot f_A \cdot (t_i - t), B, t \cdot f_A \cdot (t' - t)) \\
&= t \cdot f_A \cdot (t' - t) \cdot f_B \cdot (t_i - t') \\
&= t \cdot f_A \cdot (t' \cdot f_B \cdot (t_i - t') - t) \\
&= \mathit{update}(t' \cdot f_B \cdot (t_i - t'), A, t) \\
&= \mathit{update}(\mathit{update}(t_i, B, t'), A, \mathit{update}(t, B, t'))
\end{aligned}$$

Subcase $t' \not\prec t_i$ and $t \prec t_i$: 

$$\begin{aligned}
\mathit{update}(\mathit{update}(t_i, A, t), B, \mathit{update}(t', A, t)) &= \mathit{update}(t \cdot f_A \cdot (t_i - t), B, t \cdot f_A \cdot (t' - t)) \\
&= t \cdot f_A \cdot (t_i - t) \\
&= \mathit{update}(t_i, A, t) \\
&= \mathit{update}(\mathit{update}(t_i, B, t'), A, \mathit{update}(t, B, t'))
\end{aligned}$$

Subcase $t' \not\prec t_i$ and $t \not\prec t_i$: 

$$\begin{aligned}
\mathit{update}(\mathit{update}(t_i, A, t), B, \mathit{update}(t', A, t)) &= \mathit{update}(t_i, B, t \cdot f_A \cdot (t' - t)) \\
&= t_i \\
&= \mathit{update}(t_i, A, t) \\
&= \mathit{update}(\mathit{update}(t_i, B, t'), A, \mathit{update}(t, B, t'))
\end{aligned}$$

Case $t' \prec t$: The proof is as for the previous case with $t$ for $t'$ and vice versa. 

Case $t \not\prec t'$ and $t' \not\prec t$: 

Subcase $t \prec t_i$: We can conclude from the assumptions that $t' \not\prec t_i$. Then 

$$\begin{aligned}
\mathit{update}(\mathit{update}(t_i, A, t), B, \mathit{update}(t', A, t)) &= \mathit{update}(t \cdot f_A \cdot (t_i - t), B, t') \\
&= t \cdot f_A \cdot (t_i - t) \\
&= \mathit{update}(t_i, A, t) \\
&= \mathit{update}(\mathit{update}(t_i, B, t'), A, \mathit{update}(t, B, t'))
\end{aligned}$$

Subcase $t \not\prec t_i$ and $t' \prec t_i$: The proof is as for the previous subcase with $t$ for $t'$ 
and vice versa. 

Subcase $t \not\prec t_i$ and $t' \not\prec t_i$: 

$$\begin{aligned}
\mathit{update}(\mathit{update}(t_i, A, t), B, \mathit{update}(t', A, t)) &= \mathit{update}(t_i, B, t') \\
&= t_i \\
&= \mathit{update}(t_i, A, t) \\
&= \mathit{update}(\mathit{update}(t_i, B, t'), A, \mathit{update}(t, B, t'))
\end{aligned}$$
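The address identity at the heart of this case analysis lends itself to an exhaustive check over short addresses. The Python sketch below is a sanity check rather than a substitute for the proof. The foot addresses `f_A` and `f_B`, the bound on address length, and the branching factor are arbitrary assumptions made for illustration.

```python
from itertools import product


def update(s, foot, t):
    """s if t is not a proper prefix of s, and t . foot . (s - t) otherwise."""
    if len(t) < len(s) and s[:len(t)] == t:
        return t + foot + s[len(t):]
    return s


def addresses(max_len, arity=2):
    """All address tuples over 1..arity of length at most max_len."""
    return [p for n in range(max_len + 1)
            for p in product(range(1, arity + 1), repeat=n)]


f_A, f_B = (2,), (1, 1)        # hypothetical foot addresses of A and B
addrs = addresses(3)

identity_holds = all(
    update(update(ti, f_A, t), f_B, update(u, f_A, t))
    == update(update(ti, f_B, u), f_A, update(t, f_B, u))
    for t, u, ti in product(addrs, repeat=3)
    if t != u                   # the proposition assumes distinct addresses
)
```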
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